Saturday, August 24, 2013

JavaScript / jQuery Survival Kit


  • jQuery
$(document).ready(function() {
  alert("ready!");
});

Wednesday, February 6, 2013

bash survival kit

  • Useful commands 
    - iotop -o (disk activity per process and thread: I/O monitor, disk access including swapping)
    - lsof (list files/network connections opened by a user)
    -- lsof -i :80 (processes listening on port 80)
    - dmesg | less
    - uname -a
    - du -sh * (folder sizes)
    - du -sh * | sort -hr | head -10 (biggest subfolders; -h sorts the human-readable sizes correctly)
    - ls -l | grep '^d' (list only directories)
    - watch 'ps aux | grep someprocess' (run a command periodically)
    - ssh -X toto@machine (enable X11 forwarding)
    - adjust the time and start ntp date tracking:
    -- /etc/init.d/ntp stop
    -- ntpdate 0.debian.pool.ntp.org (synchronize the clock)
    -- /etc/init.d/ntp start
    -- see /etc/ntp.conf
    - kill all X server sessions: sudo killall Xorg
    - tunnel: ssh -f -N -L 10.193.129.2:9090:192.168.0.202:8080 192.168.50.99 (-N: no remote command, needed with -f; see the sketch at the end of this section)
    - sudo fuser -v 12001/tcp    (check which process/user is using a port)
    - nslookup (get the hostname of a remote machine from its IP)
    - check whether a remote machine's port is open (see also the /dev/tcp sketch at the end of this section):
    -- wget -qS -O- http://csimg.toto.me:80
    -- curl http://csimg.toto.me:80
  • route (to see the gateway)
  • open ports
    • netstat -ltu (listening TCP/UDP ports on the local machine)
    • nmap (may not show all open ports; also works against remote hosts)
    • sudo nmap -sU -p 500 192.168.1.254 (check whether the Internet Key Exchange port is open; used for VPN)
  • most recent java files in the tree
    • find . -name '*.java' -printf '%T+ %p\n' | sort (GNU find; sortable ISO dates, unlike the ls -l month names, which sort alphabetically)
  • swap per process
    • for file in /proc/*/status ; do awk '/VmSwap|Name/{printf $2 " " $3}END{ print ""}' $file; done
  • find
    • find . -name pom.xml | xargs grep 3.5 --color      (emacs/dired to replace)
    • find . -name '*.log.*' -delete 
    • find . -type f | wc -l (count the files in a folder and its subdirectories; piping to wc itself would count lines of content, not files)
  • grep
    • grep -o OTHERS tb-201610.sql | wc -l (count occurrences; -o puts each match on its own line)
    • grep -P "\t8" file1.txt |awk -F$'\t' '{print $1}' |while read -r x; do grep "$x" file2.txt; done (take column 1 of the matching file1 lines and look each value up in file2)
    • look for multiple values
      • egrep "126|127|128|129|130|131|132|133" toto.txt
  • System 
    • sudo bash (remain as root)
    • uname -a
    • less /proc/meminfo
    • less /proc/cpuinfo
    • lspci -v | less (list all PCI devices connected to the PCI bus)
    • dmesg | less (kernel boot messages)
    • ls -1R | wc -l (rough file count; the output also includes directory headers and blank lines)
    • fdisk -l (disks on a machine)
    • dmidecode (physical processor info)
    • ps auxf (process tree)
    • ps -efj (shows the PPID)
    • PPID column in htop: F2, then add the column
    • uptime (how long since the machine last restarted)

  • load the CPU:
    • for i in {1..1000000}; do gzip speetest; gunzip speetest.gz; done
  • loop over a filtered list of files:
    • for f in $(find . -name "*.db" | grep 'rep_13' | grep run); do echo $f; done
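  • tunnel syntax, spelled out: -L reads [local_bind:]local_port:target_host:target_port, and the last argument is the ssh gateway host. A minimal local-forward sketch (user, jump-host and internal-host are placeholders):
    ssh -f -N -L 9090:internal-host:8080 user@jump-host
    # http://localhost:9090/ then reaches internal-host:8080 through the tunnel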
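  • checking a port without extra tools, via bash's /dev/tcp pseudo-device (same host/port as the wget example above):
    timeout 2 bash -c '</dev/tcp/csimg.toto.me/80' && echo open || echo closed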

awk survival kit


look for a string in the 28th column of a TSV:

awk -F$'\t' '$28 == "popcorn"' toto.tsv
awk -F$'\t' '{print $18}' wm_marchand_Data.txt | sort -u   (unique values of column 18)
awk -F$'\t' '{print $2 "\t" $1}' sb.txt   (swap the first two columns)

for special characters:
iconv -f ISO-8859-1 -t utf8 ri_thesaurus_v2_Data_LG161___.txt > ri_thesaurus_v2_Data_LG161____.txt
iconv -f utf8 -t ISO-8859-1//TRANSLIT lbm-ko-full-201806.csv > lbm-ko-full-201806-ISO8859.csv
  • keep only the number after = in the output
awk -F$' ' '{print $11}' noc.log  | sed -r 's/.*=([0-9]+)/\1/g'
  • keep only numbers
awk -F':' '{print $3}'  1300-siren.txt | sed -r 's/[^0-9]//g'
  • sort numerically by the extracted value
awk -F$' ' '{print $11}' noc.log  |sed -r 's/.*=([0-9]+)/\1/g' | sort -k1,1n > noc-out.log

  • swap usage per process, sorted numerically
for file in /proc/*/status ; do awk '/VmSwap|Name/{printf $2 " " $3}END{ print ""}' $file; done| awk -F$' ' '{print $2}'  |sort  -k1,1n
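  • awk also aggregates: a minimal sketch counting rows per value of the 28th column in the same toto.tsv (column number reused from the example above)
awk -F'\t' '{count[$28]++} END {for (k in count) print count[k] "\t" k}' toto.tsv | sort -rn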


Thursday, January 17, 2013

Elastic Search Survival Kit



  • Each Lucene segment has its own cache, so indexing does not hurt search performance too much
  • every node can act as a "master": every node indexes and every node searches
  • To use kibana  (http://127.0.0.1:5601/app/kibana#/dev_tools)
  • to access Kibana from a remote machine, change kibana.yml:
    • server.host: "0.0.0.0"
PUT /fb
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "type": "custom",
          "tokenizer" : "whitespace",
          "filter": ["lowercase", "stop"]
        }
      }
    }
  },
  "mappings": {
    "pages": {
      "properties": {
        "about": {
          "type":     "text",
          "fielddata": true,
          "analyzer": "my_analyzer"
        }
      }
    }
  }
}
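
The same index creation sent with curl instead of the Kibana console (settings part only, as a sketch; assumes a local node on 9200, and note that the Content-Type header is mandatory from 6.x on):
curl -XPUT 'http://127.0.0.1:9200/fb' -H 'Content-Type: application/json' -d '
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "type": "custom",
          "tokenizer": "whitespace",
          "filter": ["lowercase", "stop"]
        }
      }
    }
  }
}'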

-----  term facets
GET /fb/pages/_search
{
  "size": 1,
  "aggs" : {
    "group_by_text_term" : {
      "terms" : {
        "field" : "about",
        "size":30
        }
    }
  }
}

----------- facets on a category (since 5.x, text fields also get a .keyword sub-field by default)
GET /fb2/pages/_search
{
  "size": 1,
  "query": {
    "match_all": {}
  },
  "aggs": {
    "group_by_cat": {          
      "terms": {          
        "field": "category.keyword",
        "size": 10
      }
    }
  }
}

------------- delete all docs
POST /fb2/_delete_by_query
{
  "query": {
    "match_all": {}
  }
}

------------- http Elastic queries

http://127.0.0.1:9200/_cat/indices

http://localhost:9200/_search?q=lastname:Bond 

http://127.0.0.1:9200/pj-search-index-1-6-test/_search
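
Handy variants of the same endpoints from the shell (v adds column headers to _cat output; pretty formats the JSON):
curl 'http://127.0.0.1:9200/_cat/indices?v'
curl 'http://127.0.0.1:9200/pj-search-index-1-6-test/_search?pretty'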


------------- versions
Elasticsearch 2.4.0, based on Lucene 5.5.2 (08/2016)
(version jump straight from 2.4 to 5.0)
Elasticsearch 5.0 (11/2016)
Elasticsearch 5.5.1, based on Lucene 6.5.1 (07/2017)
Elasticsearch 5.6.3, based on Lucene 6.6.1 (10/2017)
Elasticsearch 6.0.0 beta, based on Lucene 7.0.0 (05/2017)
Elasticsearch 6.7.1, based on Lucene 7.7.1 (04/2019)
Elasticsearch 7.0.0, based on Lucene 8.0.0 (04/2019)
In the git sources:
vi buildSrc/version.properties
elasticsearch     = 5.6.3
lucene            = 6.6.1


Tuesday, January 8, 2013

Hadoop / Cloudera survival kit

----- Debian
add this to /etc/apt/sources.list:

deb http://archive.cloudera.com/cdh4/debian/squeeze/amd64/cdh/ squeeze-cdh4.1.2 contrib

then you can do:

apt-get update
apt-get install hadoop

and then, things like:

hadoop fs -ls hdfs://192.168.0.135:8020/
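
and a few more everyday hdfs commands in the same vein (the paths below are hypothetical):

hadoop fs -mkdir /user/toto/in            # create an HDFS directory
hadoop fs -put local.txt /user/toto/in/   # copy a local file into HDFS
hadoop fs -get /user/toto/in/local.txt .  # copy it back to the local FS
hadoop fs -rm -r /user/toto/in            # recursive delete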

----- Ubuntu:
add this to /etc/apt/sources.list:

deb [arch=amd64] http://archive.cloudera.com/cdh4/ubuntu/precise/amd64/cdh/ precise-cdh4 contrib
deb-src http://archive.cloudera.com/cdh4/ubuntu/precise/amd64/cdh precise-cdh4 contrib

curl -s http://archive.cloudera.com/cdh4/ubuntu/precise/amd64/cdh/archive.key | sudo apt-key add -
apt-get update
sudo apt-get install hbase-master


[toto@vv182 ~]$  echo "scan 'offers', {LIMIT => 10, STARTROW => 'se|000029098138', ENDROW => 'se|0000291'}" |hbase shell

[toto@vv182 ~]$  echo "get 'offers','fr|000000002138|0000016418701245'" |hbase shell


[toto@vv182 ~]$ hadoop fs -cat /user/nomad/pipeline/delta_offers/my-file.txt
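
Other everyday hbase shell one-liners, same pattern as above (the put is illustrative only and assumes a column family named 'cf'):

echo "list" | hbase shell                                      # list tables
echo "count 'offers'" | hbase shell                            # row count (slow on big tables)
echo "put 'offers', 'fr|test', 'cf:col', 'x'" | hbase shell    # hypothetical write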