Saturday, August 24, 2013

JavaScript / JQuery Survival Kit


  • jQuery
$(document).ready(function () {
  alert("ready!");
});

Wednesday, February 6, 2013

bash survival kit

  • Useful commands 
    - iotop -o (disk I/O per process and thread, including swapping; -o shows only processes actually doing I/O)
    - lsof (List files/network connections open by user)
    -- lsof -i:80 (processes with a connection on port 80; add -sTCP:LISTEN to keep only listeners) - see also lsof description
    - dmesg |less
    - uname -a
    - du -sh * (Folders size)
    - du -sh * | sort -hr | head -10 (biggest subfolders; sort -h understands the human-readable sizes printed by du -h)
    - ls -l | grep '^d' (list only directories)
    - watch 'ps aux | grep someprocess' (run a command periodically)
    - ssh -X toto@machine (enables X11)
    - adjust the time and restart NTP tracking:
    - - /etc/init.d/ntp stop
    - - ntpdate 0.debian.pool.ntp.org (synchronize the computer's time)
    - - /etc/init.d/ntp start
    - - see /etc/ntp.conf
    - to kill all X server sessions: sudo killall Xorg
    - tunnel: ssh -f -N -L 10.193.129.2:9090:192.168.0.202:8080 192.168.50.99 (-N: no remote command, required when -f is used without one)
    - sudo fuser -v 12001/tcp    (check which process/user is using a port)
    - nslookup <ip>: get the name of a remote machine from its IP
    - check if a remote machine port is open:
    -- wget -qS -O- http://csimg.toto.me:80
    -- curl http://csimg.toto.me:80 
  • route (to see the gateway)
  • open ports
    • netstat -ltu   (local)
    • nmap <host> (scans only the most common ports by default, so not every open port shows up; works on remote hosts)
    • sudo nmap -sU  -p 500 192.168.1.254   (check internet key exchange port open, used for vpn)
  • most recent java files in the tree
    • find . -name '*.java' -printf '%TY-%Tm-%Td %TH:%TM %p\n' | sort -r | head   (GNU find; newest first)
  • swap per process
    • for file in /proc/*/status ; do awk '/VmSwap|Name/{printf $2 " " $3}END{ print ""}' $file; done
  • find
    • find . -name pom.xml | xargs grep 3.5 --color      (emacs/dired to replace)
    • find . -name '*.log.*' -delete 
    • find . -type f | wc -l (count the number of files in a folder and its sub-directories)
  • grep
    • grep -o OTHERS tb-201610.sql | wc -l  (count occurrences; -o prints each match on its own line)
    • grep -P "\t8" file1.txt |awk -F$'\t' '{print $1}' |while read -r x; do grep "$x" file2.txt; done
    • look for multiple values
      • grep -E "126|127|128|129|130|131|132|133" toto.txt
  • System 
    • sudo bash (remain as root)
    • uname -a
    • less /proc/meminfo
    • less /proc/cpuinfo
    • lspci -v | less (list all PCI devices connected to the PCI bus)
    • dmesg | less (kernel boot messages)
    • ls -1R | wc -l (rough number of entries in a folder; directory headers and blank lines are counted too)
    • fdisk -l (disks on a machine)
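    • dmidecode (hardware info from the BIOS/DMI tables; dmidecode -t processor for the physical processors)
    • ps auxf  (process tree)
    • ps -efj   (shows the PPID)
    • PPID in htop: press F2 (Setup) and add the PPID column
    • uptime   (how long the machine has been up since its last restart)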
  • DNS
    • dig cnn.com
  • load CPU:
    • for i in {1..1000000}; do gzip speetest; gunzip speetest.gz; done
  • loop over files selected by find and grep
    • for f in $(find . -name "*.db" | grep 'rep_13' | grep run | xargs ls -1); do echo "$f"; done
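
The "biggest subfolders" one-liner above can be wrapped in a small function. This is only a sketch: the name biggest and its two arguments are made up for illustration. It asks du for plain kilobytes so that an ordinary numeric sort is enough, even where sort -h is not available.

```shell
# biggest DIR N: print the N largest entries directly under DIR, largest first.
# Sizes are in kilobytes (du -k), so "sort -rn" orders them correctly.
# Hidden entries (dotfiles) are not matched by the * glob.
biggest() {
  dir=${1:-.}
  n=${2:-10}
  du -sk -- "$dir"/* 2>/dev/null | sort -rn | head -n "$n"
}

# Example: the five largest entries of the current directory.
biggest . 5
```

Each output line is "size<TAB>path", so the result can be piped into cut -f2 to keep only the names.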

awk survival kit


look for a string in the 28th column of a tsv:

awk -F$'\t' '{if($28 == "popcorn") print $0;}' toto.tsv

distinct values of the 18th column:

awk -F$'\t' '{print $18}' wm_marchand_Data.txt | sort -u

swap the first two columns:

awk -F$'\t' '{print $2 "\t" $1}' sb.txt
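
A small sketch of the same column filter as a reusable function. The name filter_col and the sample data are hypothetical; the column number is passed in instead of being hard-coded to 28. It uses -F'\t' rather than -F$'\t' so that it also works in shells without the $'...' syntax.

```shell
# filter_col FILE N VALUE: print the rows of a tab-separated FILE
# whose Nth column is exactly VALUE.
filter_col() {
  awk -F'\t' -v n="$2" -v v="$3" '$n == v' "$1"
}

# Hypothetical sample: three rows, the snack name is in column 2.
sample=$(mktemp)
printf 'id1\tpopcorn\tA\nid2\tcandy\tB\nid3\tpopcorn\tC\n' > "$sample"

# Prints the id1 and id3 rows.
filter_col "$sample" 2 popcorn
rm -f "$sample"
```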

for special characters (convert between encodings):
iconv -f ISO-8859-1 -t utf8 ri_thesaurus_v2_Data_LG161___.txt > ri_thesaurus_v2_Data_LG161____.txt
iconv -f utf8 -t ISO-8859-1//TRANSLIT lbm-ko-full-201806.csv > lbm-ko-full-201806-ISO8859.csv
  • keep numbers after = in the output
awk -F$' ' '{print $11}' noc.log  | sed -r 's/.*=([0-9]+)/\1/g'
  • keep only numbers
awk -F':' '{print $3}'  1300-siren.txt | sed -r 's/[^0-9]//g'
  • sort by full number
awk -F$' ' '{print $11}' noc.log  |sed -r 's/.*=([0-9]+)/\1/g' | sort -k1,1n > noc-out.log

  • grep -P "\t8" file1.txt |awk -F$'\t' '{print $1}' |while read -r x; do grep "$x" file2.txt; done
  • swap used per process, sorted by size (kB), biggest last
for file in /proc/*/status ; do awk '/VmSwap|Name/{printf $2 " " $3}END{ print ""}' "$file"; done | sort -k2,2n
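
Why the numeric sort matters in "sort by full number": a plain sort compares text, so 120 would come before 35 and 9. A minimal sketch, with a made-up function name extract_sorted and made-up input lines:

```shell
# extract_sorted: read lines such as "took=120" on stdin, keep only the digits
# that follow the last "=", and sort them as numbers.
# Lines without "=<digits>" pass through sed unchanged.
extract_sorted() {
  sed -E 's/.*=([0-9]+).*/\1/' | sort -n
}

# Prints 9, 35, 120 (one per line).
printf 'took=120\ntook=9\ntook=35\n' | extract_sorted
```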


Thursday, January 17, 2013

Elastic Search Survival Kit


  • Each Lucene segment has its own cache, so indexing does not hurt search performance much
  • by default every node is master-eligible (a "master"); every node indexes and every node searches
  • Kibana dev tools console: http://127.0.0.1:5601/app/kibana#/dev_tools
  • to access Kibana from a remote machine, change kibana.yml:
    • server.host: "0.0.0.0"

----- create an index with a custom analyzer
PUT /fb
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "type": "custom",
          "tokenizer" : "whitespace",
          "filter": ["lowercase", "stop"]
        }
      }
    }
  },
  "mappings": {
    "pages": {
      "properties": {
        "about": {
          "type":     "text",
          "fielddata": true,
          "analyzer": "my_analyzer"
        }
      }
    }
  }
}

-----  term facets
GET /fb/pages/_search
{
  "size": 1,
  "aggs" : {
    "group_by_text_term" : {
      "terms" : {
        "field" : "about",
        "size":30
        }
    }
  }
}
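
The same terms aggregation can be sent from bash instead of the Kibana console. This is a sketch under assumptions: a node listening on 127.0.0.1:9200 (as in the URLs further down), and two made-up helper names, terms_agg_body and es_terms. It posts to /<index>/_search without the type and uses "size": 0 to skip the hits and return only the buckets.

```shell
# terms_agg_body FIELD SIZE: print the JSON body of a terms aggregation.
terms_agg_body() {
  printf '{"size":0,"aggs":{"group_by_text_term":{"terms":{"field":"%s","size":%d}}}}\n' "$1" "$2"
}

# es_terms INDEX FIELD SIZE: send the aggregation to a local node.
# Assumes Elasticsearch answers on 127.0.0.1:9200; not called here.
es_terms() {
  curl -s -H 'Content-Type: application/json' \
       -X POST "http://127.0.0.1:9200/$1/_search?pretty" \
       -d "$(terms_agg_body "$2" "$3")"
}

# Show the body that "es_terms fb about 30" would send.
terms_agg_body about 30
```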

----------- facets on a category (since v5, dynamically mapped string fields are text with a .keyword sub-field)
GET /fb2/pages/_search
{
  "size": 1,
  "query": {
    "match_all": {}
  },
  "aggs": {
    "group_by_cat": {          
      "terms": {          
        "field": "category.keyword",
        "size": 10
      }
    }
  }
}

------------- delete all docs
POST /fb2/_delete_by_query
{
  "query": {
    "match_all": {}
  }
}

------------- to get more details of what is happening in a query
POST antoine-try-iso2/_search
{
  "profile": true,
  "_source": ["s_alpha.denom"],
  "query": {
[...]

------------- http Elastic queries

http://127.0.0.1:9200/_cat/indices

http://localhost:9200/_search?q=lastname:Bond 

http://127.0.0.1:9200/pj-search-index-1-6-test/_search


------------- search templates
GET _cluster/state/metadata?pretty&filter_path=**.stored_scripts

------------- versions
Elasticsearch 2.4.0, based on Lucene 5.5.2 (08/2016)
(ES version hop 2.4 to 5.0)
Elasticsearch 5.0 (11/2016)
Elasticsearch 5.5.1, based on Lucene 6.5.1 (07/2017)
Elasticsearch 5.6.3, based on Lucene 6.6.1 (10/2017)

Elasticsearch 6.0.0 beta, based on Lucene 7.0.0 (05/2017)
Elasticsearch 6.7.1, based on Lucene 7.7.1 (04/2019)
Elasticsearch 7.0.0, based on Lucene 8.0.0 (04/2019)
In git sources:
vi  buildSrc/version.properties
elasticsearch     = 5.6.3
lucene            = 6.6.1
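
Instead of opening the file in vi, the two values can be read with awk. A sketch only: prop_value is a made-up helper, and it assumes the simple "key = value" layout shown above.

```shell
# prop_value FILE KEY: print the value of a "KEY = value" line in a
# properties file, with surrounding spaces and tabs removed.
prop_value() {
  awk -F'=' -v k="$2" '
    { key = $1; gsub(/[ \t]/, "", key) }
    key == k { gsub(/[ \t]/, "", $2); print $2 }
  ' "$1"
}

# Usage, from the root of the Elasticsearch sources:
#   prop_value buildSrc/version.properties lucene
```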


Tuesday, January 8, 2013

Hadoop / Cloudera survival kit

----- Debian
add this in /etc/apt/sources.list:

deb http://archive.cloudera.com/cdh4/debian/squeeze/amd64/cdh/ squeeze-cdh4.1.2 contrib

then you can do:

apt-get update
apt-get install hadoop

and then, things like:

hadoop fs -ls hdfs://192.168.0.135:8020/

----- Ubuntu:
add to /etc/apt/sources.list:

deb [arch=amd64] http://archive.cloudera.com/cdh4/ubuntu/precise/amd64/cdh/ precise-cdh4 contrib
deb-src http://archive.cloudera.com/cdh4/ubuntu/precise/amd64/cdh precise-cdh4 contrib

curl -s http://archive.cloudera.com/cdh4/ubuntu/precise/amd64/cdh/archive.key | sudo apt-key add -
sudo apt-get update
sudo apt-get install hbase-master


[toto@vv182 ~]$  echo "scan 'offers', {LIMIT => 10, STARTROW => 'se|000029098138', ENDROW => 'se|0000291'}" |hbase shell

[toto@vv182 ~]$  echo "get 'offers','fr|000000002138|0000016418701245'" |hbase shell
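
How STARTROW and ENDROW select rows in the scan above: HBase keeps row keys in byte order, and a scan returns keys from the start row (included) up to the stop row (excluded). The sketch below only imitates that comparison on a list of keys with awk; rows_in_range and the sample keys are made up, and nothing here talks to HBase.

```shell
# rows_in_range START STOP: keep the stdin lines whose key satisfies
# START <= key < STOP, compared as plain strings in byte order (LC_ALL=C).
rows_in_range() {
  LC_ALL=C awk -v s="$1" -v e="$2" '($0 "") >= (s "") && ($0 "") < (e "")'
}

# Hypothetical keys; only the two middle ones fall inside the range.
printf '%s\n' 'se|00002909' 'se|000029098138|1' 'se|000029099' 'se|0000291|x' |
  rows_in_range 'se|000029098138' 'se|0000291'
```

A key that is only a prefix of the start row sorts before it, which is why 'se|00002909' is left out.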


[toto@vv182 ~]$ hadoop fs -cat /user/nomad/pipeline/delta_offers/my-file.txt