Getting git experiences

I had to fetch the contents of a file dynamically from an AWS CodeCommit repo and supply them as an array list to the Groovy shell in a Jenkins job. The file is tiny and holds one word per line. Those words correspond to the tags (modules) defined in the Robot file used for automation.

I could put the file on the Jenkins server manually and read it from there in the Jenkins Groovy shell. But the file may be modified, and new entries get added whenever new modules are added for automation. Its contents are shown as checkbox items when you hit the Jenkins job’s ‘Build with Parameters’ button, and the user selects one or more of them as inputs to the job. So fetching the file from the CodeCommit repo dynamically each time the Jenkins job runs is unavoidable; otherwise, maintaining the file manually would be a laborious affair.

Well, that’s the background story. So the first step is to download that module list file from the CodeCommit repo without cloning the entire repo. I can’t use the SCM step (which clones the entire repo) in the automation job, because the items have to be shown as checkboxes before the repo is cloned. Hope you got it.

There are a couple of ways to check out a single file from a repo, such as git sparse checkout or git archive. If it is a GitHub repo, you can even use wget on the raw content URL (for example, https://raw.githubusercontent.com/ambatigan/list-items/master/items_list.txt). But this is an AWS CodeCommit repo, which does not provide raw content URLs like GitHub does. AWS does provide API calls for getting the blob of the file content, but they have to be authenticated first.

If you have svn installed on your server, you can also try ‘svn export’ on the GitHub URL (for example, svn export https://github.com/ambatigan/list-items.git/trunk/items_list.txt). But for AWS CodeCommit, this approach needs the credentials to be supplied in the command itself.

I discovered that the AWS CLI (at the time of writing, only the latest version, 1.16.x) introduced a get-file subcommand for aws codecommit. The get-file response returns the fileContent base64-encoded, and you can use the standard base64 utility on the Linux server to decode it back to the original content.

For example, the response to the get-file command (ex: aws codecommit get-file --repository-name Testing-Automation --file-path /Jenkins/Dev/Resources/Input_data/tag_names.txt) is as follows:

[ec2-user@ip-60-0-1-94 ~]$ aws codecommit get-file --repository-name Testing-Automation --file-path /Jenkins/Dev/Resources/Input_data/tag_names.txt
{
    "filePath": "Jenkins/Dev/Resources/Input_data/tag_names.txt",
    "blobId": "a6c7ac16cf059e739c3ad50efc2375d95feea03c",
    "commitId": "e2a51016ce9b3f281504257124e6b6d72d3e338e",
    "fileSize": 26,
    "fileContent": "QWxsCkxvZ2luClRyZW5kcwpTaXRlX21hcAo=",
    "fileMode": "NORMAL"
}

And we can decode it like this:

[ec2-user@ip-60-0-1-94 ~]$ echo QWxsCkxvZ2luClRyZW5kcwpTaXRlX21hcAo= | base64 -d
All
Login
Trends
Site_map
[ec2-user@ip-60-0-1-94 ~]$

Summary: aws codecommit get-file --repository-name Testing-Automation --file-path /Jenkins/Dev/Resources/Input_data/tag_names.txt | jq -r '.fileContent' | base64 -d
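
If jq is not installed on the Jenkins server, the CLI's own global --query and --output options can pull out the field instead. Here is a minimal sketch, assuming the AWS CLI is already configured with credentials that can read the repo. The function name fetch_codecommit_file is only illustrative and is not taken from the linked repo.

```shell
# Hypothetical helper: print the decoded content of one file in a CodeCommit repo.
# Usage: fetch_codecommit_file <repository-name> <file-path>
fetch_codecommit_file() {
    # --query selects only the base64 string; --output text prints it without JSON quoting
    aws codecommit get-file \
        --repository-name "$1" \
        --file-path "$2" \
        --query fileContent \
        --output text | base64 -d
}

# Example call (needs real AWS credentials, so it is left commented out):
# fetch_codecommit_file Testing-Automation /Jenkins/Dev/Resources/Input_data/tag_names.txt
```

Wrapping the call in a function keeps the repo name and the path in one place, which is handy if the same list is needed by more than one job.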

Refer to https://github.com/ambatigan/list-items to see how I implemented this core concept in the Groovy shell of the Jenkins job.

disk space

Have you ever run into a situation where your root partition is full (100%) but there seems to be nothing to clean up? In other words, the ‘df’ output shows 100% (or some huge amount of) disk usage for a filesystem, but the ‘du’ output for it adds up to far less?

For example, I hit exactly this on one of our production hosts: the root partition was 100% full, but running ‘du’ on all the folders under that partition showed nowhere near that much used space.

That led me to run the following lsof command to investigate:

undisclosed-host:/ # lsof|grep -i delete
….
….
lrthdf5   55870  root    1w      REG                8,3 13876053553    3278175 /var/log/lrthdf5/onl_dev_bin_mc3_dev.log (deleted)
lrthdf5   56098  root    2w      REG                8,3 13881587249    3278181 /var/log/lrthdf5/onl_replay_mc3replay.log (deleted)

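From the above ‘lsof’ output, we find that the processes with pids 55870 and 56098 are holding the files /var/log/lrthdf5/onl_dev_bin_mc3_dev.log and /var/log/lrthdf5/onl_replay_mc3replay.log open on the file descriptors shown (1w and 2w). Although the files are marked (deleted), the SIZE/OFF column shows that each one is still about 13.9 GB. That space is not released until the last open descriptor is closed, which is why ‘df’ counts it while ‘du’, which only walks the directory tree, cannot see it.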

It looked to me as if, as part of some maintenance process, somebody had deleted those log files while the running processes were still writing to them.
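
On a host without lsof, roughly the same information can be read straight from /proc. The following is a rough sketch, not a replacement for lsof: the function name list_deleted_open_files is made up for illustration, and it relies on Linux appending " (deleted)" to the link target of a removed file and on GNU stat being available.

```shell
# Hypothetical helper: list deleted-but-still-open files as "size<TAB>fd link<TAB>target".
# Run it as root to see every process; entries that cannot be read are skipped.
list_deleted_open_files() {
    for fd_link in /proc/[0-9]*/fd/*; do
        # The process may exit while we loop, so tolerate readlink failures
        target="$(readlink "$fd_link" 2>/dev/null)" || continue
        case "$target" in
            /*" (deleted)")
                # -L follows the /proc link to the underlying file to get its size in bytes
                size="$(stat -L -c %s "$fd_link" 2>/dev/null)" || continue
                printf '%s\t%s\t%s\n' "$size" "$fd_link" "$target"
                ;;
        esac
    done
}

# Example: show the ten largest offenders
# list_deleted_open_files | sort -n | tail
```

The pid and the descriptor number can be read off the second column (/proc/<pid>/fd/<fd>), which matches the PID and FD columns of the lsof output above.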

Having identified these files, I freed the space they occupied by shutting down the processes in question.

Before:

undisclosed-host:/ # df -Th
Filesystem Type Size Used Avail Use% Mounted on
/dev/sda3 ext3 63G 60G 384M 100% /
tmpfs tmpfs 16G 0 16G 0% /sys/fs/cgroup
udev tmpfs 16G 212K 16G 1% /dev
tmpfs tmpfs 16G 0 16G 0% /dev/shm
/dev/sdb1 xfs 5.0T 3.7T 1.3T 74% /mnt/sdb1

After shutting down the process (used space dropped from 100% to 78%):

undisclosed-host:/ # stpcap onl_replay mc3replay
Fri Feb 9 00:18:09 PST 2018: Stopping capture processes. Please wait, it may take a while…

undisclosed-host:/ # df -Th
Filesystem Type Size Used Avail Use% Mounted on
/dev/sda3 ext3 63G 47G 14G 78% /
tmpfs tmpfs 16G 0 16G 0% /sys/fs/cgroup
udev tmpfs 16G 212K 16G 1% /dev
tmpfs tmpfs 16G 0 16G 0% /dev/shm
/dev/sdb1 xfs 5.0T 3.7T 1.3T 74% /mnt/sdb1

Alternatively, it is possible to make the system de-allocate the space consumed by an in-use file by truncating it through the proc file system (/proc/<pid>/fd/<fd>). That is a more advanced route, and it was not needed in my case.
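
For the curious, here is a small self-contained sketch of that truncation trick. It only touches a throwaway temp file held open by the current shell, so it is safe to try on a Linux box (it assumes GNU stat). On a real host the path would be /proc/<pid>/fd/<fd> of the offending process, taken from the lsof output. One caveat: a process that did not open its log in append mode keeps writing at its old offset after the truncation.

```shell
# Demo on a temp file: delete it while it is open, then truncate it via /proc.
tmpfile="$(mktemp)"
exec 9>"$tmpfile"                    # hold the file open for writing on fd 9 of this shell
head -c 1048576 /dev/zero >&9        # write 1 MiB through the open descriptor
rm -f "$tmpfile"                     # remove the name; the blocks stay allocated

link_target="$(readlink "/proc/$$/fd/9")"        # now ends with " (deleted)"
size_before="$(stat -L -c %s "/proc/$$/fd/9")"   # still 1 MiB

: > "/proc/$$/fd/9"                  # truncate the open file through the proc file system
size_after="$(stat -L -c %s "/proc/$$/fd/9")"    # space has been released

echo "$link_target: $size_before -> $size_after bytes"
exec 9>&-                            # close the descriptor
```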

/proc is cool …….

Have you ever peeked into the /proc directory on a Linux system?
I believe it is one magical directory that lays a lot of things bare, especially when you are debugging problems related to networking and performance.
I check this directory quite often whenever I’m in Operations attire. 🙂 But a while ago I looked into /proc while debugging a Hyperledger Fabric smart contract deployment and its interactions with the network peers and orderers. So I thought of sharing some basic tips; most of the senior folks already know them, but they can refresh their memories here… 🙂

Let me demonstrate a few of those things here. First, let me grab the pid of a process called ‘peer’ on my Hyperledger Fabric Linux host:

root@blk_chain_hlf1:/home/ganga# pgrep peer
23709

And we are gonna look for it under /proc directory..

root@blk_chain_hlf1:/home/ganga# ls -l /proc/23709/
total 0
dr-xr-xr-x 2 root root 0 Oct 27 22:53 attr
….
-r--r--r-- 1 root root 0 Oct 27 22:53 cgroup
--w------- 1 root root 0 Oct 27 22:53 clear_refs
-r--r--r-- 1 root root 0 Oct 27 01:05 cmdline
-rw-r--r-- 1 root root 0 Oct 27 22:53 comm
-rw-r--r-- 1 root root 0 Oct 27 22:53 coredump_filter
-r--r--r-- 1 root root 0 Oct 27 22:53 cpuset
lrwxrwxrwx 1 root root 0 Oct 27 22:53 cwd -> /opt/gopath/src/github.com/hyperledger/fabric/peer
-r-------- 1 root root 0 Oct 27 22:53 environ
lrwxrwxrwx 1 root root 0 Oct 27 01:05 exe -> /usr/local/bin/peer
dr-x------ 2 root root 0 Oct 26 15:31 fd
dr-x------ 2 root root 0 Oct 27 22:53 fdinfo
……
……
-r--r--r-- 1 root root 0 Oct 27 22:53 wchan
root@blk_chain_hlf1:/home/ganga#

Let’s start by looking at the file descriptors. The fd directory holds one entry per descriptor the process currently has open: regular files, but also pipes and sockets.

root@blk_chain_hlf1:/home/ganga# ls -l /proc/23709/fd
total 0
lr-x------ 1 root root 64 Oct 26 15:31 0 -> pipe:[4502095]
l-wx------ 1 root root 64 Oct 26 15:31 1 -> pipe:[4502096]
l-wx------ 1 root root 64 Oct 27 22:56 10 -> /var/hyperledger/production/ledgersData/chains/index/000001.log
lrwx------ 1 root root 64 Oct 27 22:56 11 -> /var/hyperledger/production/ledgersData/stateLeveldb/LOCK
…….
l-wx------ 1 root root 64 Oct 27 22:56 16 -> /var/hyperledger/production/ledgersData/historyLeveldb/LOG
l-wx------ 1 root root 64 Oct 27 22:56 17 -> /var/hyperledger/production/ledgersData/historyLeveldb/MANIFEST-000000
l-wx------ 1 root root 64 Oct 27 22:56 18 -> /var/hyperledger/production/ledgersData/historyLeveldb/000001.log
lrwx------ 1 root root 64 Oct 27 22:56 23 -> /var/hyperledger/production/ledgersData/chains/chains/myc/blockfile_000000
lrwx------ 1 root root 64 Oct 27 22:56 24 -> socket:[4503236]
……….
lrwx------ 1 root root 64 Oct 27 22:56 7 -> /var/hyperledger/production/ledgersData/chains/index/LOCK
l-wx------ 1 root root 64 Oct 27 22:56 8 -> /var/hyperledger/production/ledgersData/chains/index/LOG
l-wx------ 1 root root 64 Oct 27 22:56 9 -> /var/hyperledger/production/ledgersData/chains/index/MANIFEST-000000

Another thing we can do is look at exe, which tells us which executable the process is running:

root@blk_chain_hlf1:/home/ganga# ls -l /proc/23709/exe
lrwxrwxrwx 1 root root 0 Oct 27 01:05 /proc/23709/exe -> /usr/local/bin/peer
root@blk_chain_hlf1:/home/ganga#

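Next, let’s look at cmdline. We can cat it to see the command line the process was started with. The arguments are separated by NUL bytes, which the terminal does not display, so they appear run together. Cool.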

root@blk_chain_hlf1:/home/ganga# ls -l /proc/23709/cmdline
-r--r--r-- 1 root root 0 Oct 27 01:05 /proc/23709/cmdline
root@blk_chain_hlf1:/home/ganga# cat /proc/23709/cmdline
peernodestart--peer-chaincodedev=true-oorderer:7050
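
Because the arguments in cmdline are separated by NUL bytes, translating each NUL into a newline makes the output readable. A small sketch, with show_cmdline as a hypothetical helper name:

```shell
# Hypothetical helper: print the arguments of a process, one per line.
# Usage: show_cmdline <pid>
show_cmdline() {
    # Each argument in /proc/<pid>/cmdline ends with a NUL byte
    tr '\0' '\n' < "/proc/$1/cmdline"
}

# Example: show_cmdline 23709
```

Printing one argument per line also makes it obvious when an argument itself contains spaces, which a plain space-joined line would hide.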

One more thing we can see is its environment variables:

root@blk_chain_hlf1:/home/ganga# cat /proc/23709/environ
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/binHOSTNAME=95b88453a1acCORE_PEER_ID=peerCORE_PEER_GOSSIP_EXTERNALENDPOINT=peer:7051CORE_LOGGING_LEVEL=DEBUGCORE_PEER_LOCALMSPID=DEFAULTCORE_PEER_ADDRESS=peer:7051CORE_VM_ENDPOINT=unix:///host/var/run/docker.sockCORE_PEER_MSPCONFIGPATH=/etc/hyperledger/mspFABRIC_CFG_PATH=/etc/hyperledger/fabricHOME=/root

The contents of environ get dumped out as one long string like above, but no worries: the entries are separated by NUL bytes, so we can make the output more readable by translating each NUL into a newline:
root@blk_chain_hlf1:/home/ganga# cat /proc/23709/environ | tr '\0' '\n'
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
HOSTNAME=95b88453a1ac
CORE_PEER_ID=peer
CORE_PEER_GOSSIP_EXTERNALENDPOINT=peer:7051
CORE_LOGGING_LEVEL=DEBUG
CORE_PEER_LOCALMSPID=DEFAULT
CORE_PEER_ADDRESS=peer:7051
CORE_VM_ENDPOINT=unix:///host/var/run/docker.sock
CORE_PEER_MSPCONFIGPATH=/etc/hyperledger/msp
FABRIC_CFG_PATH=/etc/hyperledger/fabric
HOME=/root
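
To pull out just one variable instead of the whole dump, the same tr trick can be combined with sed. This is a sketch, with get_proc_env as a hypothetical helper name. Keep in mind that environ shows the environment the process was started with, so changes the process makes to its own environment later generally do not show up here.

```shell
# Hypothetical helper: print the value of one environment variable of a process.
# Usage: get_proc_env <pid> <VARIABLE_NAME>
get_proc_env() {
    # Split the NUL-separated entries into lines, then strip the "NAME=" prefix
    # from the matching entry and print only that line
    tr '\0' '\n' < "/proc/$1/environ" | sed -n "s/^$2=//p"
}

# Example: get_proc_env 23709 CORE_PEER_ADDRESS
```

Reading another user's environ needs root, as the -r-------- permissions in the listing above suggest.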

Really cool……