Notebook

You know the short problems that you keep solving and they keep coming back and you keep Googling? I just write them down in this notebook :) I share them here publicly in case somebody else could benefit...

Apache SSL setup

Get the certificates with certbot:
    certbot certonly --webroot -w /var/www/mydomain.com -d mydomain.com
Configure Apache to load the certificates:
  SSLEngine on
  SSLCertificateFile /etc/letsencrypt/live/mydomain.com/fullchain.pem
  SSLCertificateKeyFile /etc/letsencrypt/live/mydomain.com/privkey.pem
Voila :)

crontab

Super easy crontab setup, you just do:
crontab -e
and enter the one-liner for your command (crontab.guru helps with working out when it should run). Save the file, exit, and you are all set up :-) Yay

multiprocessing and multithreading

To run a function with different parameters in parallel in Python, use the multiprocessing module. The threading module often does not use several cores at once (in CPython the global interpreter lock lets only one thread execute Python code at a time).
from multiprocessing import Process

max_cores = 10

def function_to_run(par1, par2):
    print(par1 * par2)

def main():
    procs = []
    t = Process(target=function_to_run, args=[10, 20])
    procs.append(t)
    t = Process(target=function_to_run, args=[20, 30])
    procs.append(t)

    # a better implementation would be with a queue -> TODO
    # start at most max_cores processes at a time and wait for the whole batch
    while len(procs) > 0:
        to_process = min(max_cores, len(procs))
        for i in range(to_process):
            procs[i].start()
        for i in range(to_process):
            procs[i].join()
        procs[:to_process] = []  # remove completed processes

if __name__ == "__main__":
    main()
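The TODO above mentions a queue-based version. A minimal sketch of that idea, assuming Python 3 and multiprocessing.Pool, which keeps its own task queue and hands the next parameter set to whichever worker becomes free. The names run_all and param_list are illustrative and not part of the original notes:

```python
from multiprocessing import Pool

max_cores = 10  # upper bound on simultaneous worker processes


def function_to_run(par1, par2):
    # Return the result instead of printing it, so the parent can collect it.
    return par1 * par2


def run_all(param_list, cores=max_cores):
    # Pool keeps an internal task queue: each worker picks up the next
    # parameter tuple as soon as it has finished the previous one.
    workers = max(1, min(cores, len(param_list)))
    with Pool(processes=workers) as pool:
        return pool.starmap(function_to_run, param_list)


if __name__ == "__main__":
    # The guard matters on platforms that start workers by re-importing
    # this module (for example Windows).
    print(run_all([(10, 20), (20, 30)]))  # [200, 600]
```

Unlike the batch loop, no worker sits idle waiting for the slowest process of a batch to finish.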

edgeR normalization (TMM)
  library("edgeR")
  library("data.table")

  input_fname = "expression_genes.tab"
  output_fname = "expression_genes_normalized.tab"

  gx <- fread(input_fname, colClasses=list(character=1:1))
  gxcounts <- gx[,-c(1:9),with=FALSE]

  group <- factor(c(rep(c(1), 30))) # doesn't influence normalization, cpm is only calculated on single replicate (library size and TMM), see d$samples
  d <- DGEList(counts=gxcounts, group=group)
  d <- calcNormFactors(d)

  d$genes <- gx[,1:9,with=FALSE]
  d$genes <- cbind(d$genes, round(cpm(d), 1))

  write.table(d$genes, file=output_fname, sep="\t", row.names=FALSE, quote=FALSE)
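For reference, the cpm() values written above are counts per million relative to the effective library size (library size times the TMM normalization factor, both listed in d$samples). A minimal Python sketch of just that arithmetic, assuming library sizes are the column sums of the counts. The counts and factors below are made up for illustration, and the sketch does not compute TMM factors itself:

```python
def cpm(counts, norm_factors):
    """Counts per million for a genes x samples list of lists.

    counts[g][s] is the raw count of gene g in sample s;
    norm_factors[s] is the normalization factor of sample s
    (in edgeR it would come from calcNormFactors).
    """
    n_samples = len(norm_factors)
    # Library size of a sample = sum of its counts over all genes.
    lib_sizes = [sum(row[s] for row in counts) for s in range(n_samples)]
    # Effective library size = library size * normalization factor.
    eff_sizes = [lib_sizes[s] * norm_factors[s] for s in range(n_samples)]
    return [[row[s] / eff_sizes[s] * 1e6 for s in range(n_samples)]
            for row in counts]


# Hypothetical data: 2 genes (rows), 2 samples (columns).
example_counts = [[10, 30], [90, 70]]
example_factors = [1.0, 0.5]
print(cpm(example_counts, example_factors))
```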

Cyberduck keeping file executable flag

If checked, Cyberduck had problems keeping the executable flag on files edited locally

java programs

To run a java program from another folder, simply specify the folder as class path:
java -cp /path/to/folder name_of_program

apache2 ProxyPass and preserve requesting host

Sometimes you forward some web traffic from one host to some other computer. But you would like to preserve the original host name in the forwarded request, mostly because then the apache server on the final host knows how to behave (ServerName, ServerAlias). Use this:
    ProxyPreserveHost On
and if you would like to set the host manually:
    RequestHeader set Host "myhost.edu"

GitHub config file

Add the [user] section to the Git config file of the repository (.git/config) if you want your commits to show up in your GitHub activity; the email should be one that is registered with your GitHub account:
[user]
      name = your name
      email = your email

matplotlib tips & tricks

Adjust plot to fit figure:
plt.tight_layout()
Histogram with user defined bin borders:
n, bins, patches = P.hist([y1_list, y2_list...], bins=range(-30, 30+1)...

Kernel density

A pro-memoria of links connected with kernel density:

https://jakevdp.github.io/blog/2013/12/01/kernel-density-estimation
http://scikit-learn.org/stable/auto_examples/neighbors/plot_kde_1d.html

Make Logitech RS 400 work perfectly also for Mac Keynote

The only problem is that the "Full screen" shortcut is F5. Simply add this key reassignment and it will work also in Keynote.



And voila, happy presenting.

bedGraph sum of counts [awk]

Sum absolute value of fourth column in a bedgraph:
awk '{ SUM += ($4 >= 0 ? $4 : -$4) } END { print SUM }' file.bed
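If the sum is needed inside a Python script, the same calculation could look roughly like this. It is a sketch that assumes whitespace-separated columns and, unlike the awk version, skips "track" and "browser" header lines; the function name is illustrative:

```python
def sum_abs_fourth_column(lines):
    """Sum the absolute value of the 4th column of bedGraph lines."""
    total = 0.0
    for line in lines:
        fields = line.split()
        # Skip blank or short lines and track/browser header lines,
        # which carry no numeric 4th column.
        if len(fields) < 4 or fields[0] in ("track", "browser"):
            continue
        total += abs(float(fields[3]))
    return total


# A file object can be passed directly, since it iterates over lines:
#     with open("file.bed") as handle:
#         print(sum_abs_fourth_column(handle))
print(sum_abs_fourth_column(["chr1\t0\t10\t-2.5", "chr1\t10\t20\t4"]))  # 6.5
```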

argparse [python]

Simple example for argparse:
import argparse
parser = argparse.ArgumentParser()
parser.add_argument('-lib_id', action="store", dest="lib_id")
parser.add_argument('-poly_id', action="store", dest="poly_id")
parser.add_argument('-force', action="store_true", default=False)
args = parser.parse_args()
Usage: python script.py [-lib_id ...] [-poly_id ...] [-force]
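To try the parser without a command line, parse_args() also accepts an explicit list of arguments. A small sketch using the same three options; build_parser and the value lib42 are illustrative:

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser()
    parser.add_argument('-lib_id', action="store", dest="lib_id")
    parser.add_argument('-poly_id', action="store", dest="poly_id")
    parser.add_argument('-force', action="store_true", default=False)
    return parser


# Passing a list to parse_args() bypasses sys.argv, handy for quick checks.
args = build_parser().parse_args(['-lib_id', 'lib42', '-force'])
print(args.lib_id)   # lib42
print(args.poly_id)  # None (option not given)
print(args.force)    # True
```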

Mac: set Cmd-V to Paste and Match Style

You can modify the default OS X keyboard settings so that the keys normally assigned to "Paste" trigger "Paste and Match Style" instead (hat tip @PenLlawen). OS X lets you take any item from an application's menu (the one up in the menubar) and give it a specific keyboard shortcut. To do so, copy the exact name of the item as you see it in the menubar and give it a new shortcut in System Preferences -> Keyboard -> Keyboard Shortcuts -> Application Shortcuts for "All Applications". Third-party apps may change the default shortcut for Paste and Match Style, but they usually don't alter the action's name, so assigning a new keyboard combination to "Paste and Match Style" should work.

How to call getattr() on the current module? [python]
import sys
m = sys.modules[__name__]
getattr(m, name)
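A typical use is calling a function of the current module by its name. A minimal sketch, with say_hello, say_bye and call_by_name as made-up example names:

```python
import sys


def say_hello():
    return "hello"


def say_bye():
    return "bye"


def call_by_name(name):
    # sys.modules[__name__] is the module object this code is running in.
    current_module = sys.modules[__name__]
    func = getattr(current_module, name)
    return func()


print(call_by_name("say_hello"))  # hello
```

getattr raises AttributeError if the module has no attribute with that name, so names coming from user input should be checked first.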

Apache production settings
ServerSignature Off

Apache 2.2 to 2.4

The old "Order allow, deny" changed from:
Order allow,deny
Allow from all
to this:
Require all granted
The 2.2 syntax doesn't work on 2.4, and there is no notice, warning or error.

Python threading queue

When you would like to run several processes and wait for them to finish, no need to open 10 terminal windows in screen. Simply use Python threading and Queue:
import os
from Queue import Queue  # Python 2 module name; on Python 3 it is "queue"
from threading import Thread

num_worker_threads = 10 # number of available workers (cores)
q = Queue()

def worker():
    while True:
        task = q.get()
        os.system(task)
        q.task_done() # important signal for q.join() to work

tasks = ["command1", "command2", "command3"...]

for i in range(num_worker_threads):
     t = Thread(target=worker)
     t.daemon = True
     t.start()

for task in tasks:
    q.put(task)

q.join()
Not only will this start your commands on num_worker_threads workers, but if you have more commands than available workers, they will start processing as soon as a worker is free...very handy indeed.
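The same pattern could be written for Python 3 roughly as follows. This is a sketch: it swaps os.system for subprocess.call so the exit codes can be collected, and run_commands is an illustrative name:

```python
import queue
import subprocess
import threading


def run_commands(commands, num_workers=10):
    """Run shell commands on worker threads; return {command: exit code}."""
    tasks = queue.Queue()
    results = {}
    lock = threading.Lock()

    def worker():
        while True:
            command = tasks.get()
            try:
                code = subprocess.call(command, shell=True)
                with lock:
                    results[command] = code
            finally:
                tasks.task_done()  # always signal, or tasks.join() would hang

    for _ in range(num_workers):
        threading.Thread(target=worker, daemon=True).start()
    for command in commands:
        tasks.put(command)
    tasks.join()  # block until every queued command has finished
    return results
```

Threads are fine here even though they share one interpreter, because each worker spends its time waiting for an external process. Identical command strings share one entry in the returned dictionary.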

NFS setup on Ubuntu

Based on this guide, server side:
apt-get install nfs-kernel-server portmap
vim /etc/exports
and allow the export of a folder (e.g. the home folder) to specific hosts:
/home   192.168.1.10(rw,sync,no_root_squash,no_subtree_check)
Complete the server part by issuing:
exportfs -a
Now on the client:
apt-get install nfs-common portmap
mkdir /local/home
mount 192.168.0.1:/home /local/home
or edit /etc/fstab. Voila, you are done.

Mysql views and performance

Imagine a scenario with 2 Mysql tables:
data (id, data_name, data_type_id)
data_type (id, data_type_name)
You could have a view defined like this:
create view data_view as
select
data.*,
data_type.data_type_name as data_type_name
from data
left join data_type on data.data_type_id = data_type.id
Imagine now you have another Mysql table:
data_complex: id, data_id
You could simply create a view that joins the already existing data_view:
create view data_complex_view as
select
data_view.*,
data_complex.*
from data_complex
left join data_view on data_complex.data_id = data_view.id
but joining 2 views would be a very bad idea in terms of performance. Since views don't have their own keys, even a very small view (~1000 rows) could take a few seconds to load. Never join views in Mysql! Instead, define the view directly on the base tables:
create view data_complex_view as
select
data_complex.*,
data.data_name as data_name,
data.data_type_id as data_type_id,
data_type.data_type_name as data_type_name
from data_complex
left join data on data_complex.data_id = data.id
left join data_type on data.data_type_id = data_type.id
Voila, you have a fast view across multiple tables.
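To check the shape of such a view without a Mysql server at hand, the definition can be tried against an in-memory SQLite database. This is only a stand-in to confirm that the joins return the expected columns; it says nothing about Mysql performance, and the inserted rows are made up:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE data_type (id INTEGER PRIMARY KEY, data_type_name TEXT);
    CREATE TABLE data (id INTEGER PRIMARY KEY, data_name TEXT, data_type_id INTEGER);
    CREATE TABLE data_complex (id INTEGER PRIMARY KEY, data_id INTEGER);

    -- One view, joined directly on the base tables (no view-on-view join).
    CREATE VIEW data_complex_view AS
    SELECT
        data_complex.*,
        data.data_name AS data_name,
        data.data_type_id AS data_type_id,
        data_type.data_type_name AS data_type_name
    FROM data_complex
    LEFT JOIN data ON data_complex.data_id = data.id
    LEFT JOIN data_type ON data.data_type_id = data_type.id;

    INSERT INTO data_type VALUES (1, 'text');
    INSERT INTO data VALUES (10, 'first', 1);
    INSERT INTO data_complex VALUES (100, 10);
""")

rows = conn.execute("SELECT * FROM data_complex_view").fetchall()
print(rows)  # [(100, 10, 'first', 1, 'text')]
```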

Ajax cross-domain scripting

If developing with Html5 and using jQuery, perhaps one way to solve the cross-domain access restrictions during development is to enable Apache headers connected with access-control:
a2enmod headers
and adding:
Header set Access-Control-Allow-Origin *
to the configuration files. Voila, you can access your scripts from anywhere using jQuery.ajax for example.

Mysql SSL, Python and Sqlalchemy

I needed to connect to a Mysql server from different sources (using python and sqlalchemy). One idea was to tunnel TCP 3306 with ssh or use stunnel. But in the end the really nice solution is to use Mysql SSL support. First create the certificates:
openssl genrsa 2048 > ca-key.pem
openssl req -new -x509 -nodes -days 1000 -key ca-key.pem > ca-cert.pem
openssl req -newkey rsa:2048 -days 1000 -nodes -keyout server-key.pem > server-req.pem
openssl x509 -req -in server-req.pem -days 1000 -CA ca-cert.pem -CAkey ca-key.pem -set_serial 01 > server-cert.pem
openssl req -newkey rsa:2048 -days 1000 -nodes -keyout client-key.pem > client-req.pem
openssl x509 -req -in client-req.pem -days 1000 -CA ca-cert.pem -CAkey ca-key.pem -set_serial 01 > client-cert.pem
[server side] : we will put the required files in /etc/mysql/newcerts:
mkdir /etc/mysql/newcerts
cp ca-cert.pem /etc/mysql/newcerts
cp server-cert.pem /etc/mysql/newcerts
cp server-key.pem /etc/mysql/newcerts
Now we need to edit /etc/mysql/my.cnf. Add these lines:
ssl
ssl-ca=/etc/mysql/newcerts/ca-cert.pem
ssl-cert=/etc/mysql/newcerts/server-cert.pem
ssl-key=/etc/mysql/newcerts/server-key.pem
Note the first line "ssl" is not a mistake, it enables Mysql to use SSL. Since we put the files in /etc/mysql/newcerts, on Ubuntu we need to notify the apparmor daemon. Edit "/etc/apparmor.d/usr.sbin.mysqld" and add line:
/etc/mysql/newcerts/* r,
Finally, restart apparmor and mysql:
service apparmor restart
service mysql restart
[client side]: copy these files in a folder on the client:
ca-cert.pem
client-cert.pem
client-key.pem
In sqlalchemy, when creating the engine, use this syntax:
# first create a dictionary with the paths to the required files
certs = {}
certs["cert"] = "client-cert.pem"
certs["key"] = "client-key.pem"
certs["ca"] = "ca-cert.pem"
# then create the engine
engine = create_engine('mysql://username:password@host/database', connect_args={'ssl':certs})
Voila, you are now connected to your server using SSL.
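Relative file names like the ones above are resolved against the current working directory, so it can help to build the dictionary from one certificate folder and fail early if a file is missing. A small sketch using only the standard library; ssl_args and the folder argument are illustrative names:

```python
import os


def ssl_args(cert_dir):
    """Build the connect_args dictionary from the files in cert_dir."""
    certs = {
        "cert": os.path.join(cert_dir, "client-cert.pem"),
        "key": os.path.join(cert_dir, "client-key.pem"),
        "ca": os.path.join(cert_dir, "ca-cert.pem"),
    }
    # Fail early with a readable message instead of a connection error.
    missing = [path for path in certs.values() if not os.path.isfile(path)]
    if missing:
        raise IOError("missing certificate files: %s" % ", ".join(missing))
    return {"ssl": certs}


# Intended use, with the same engine URL as above:
#     engine = create_engine('mysql://username:password@host/database',
#                            connect_args=ssl_args('/path/to/certs'))
```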

Running processes from python

To run a process from python and explore its stdout and stderr, you could use a class like this:
import subprocess

class Cmd():

    def __init__(self, command):
        self.command = command
        self.returncode = None
        self.process = subprocess.Popen(['/bin/bash', '-cl', command], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
        self.pid = self.process.pid

    def run(self):
        output, error = self.process.communicate()
        self.returncode = self.process.returncode
        return output, error
An example usage would be:
p = Cmd("ls -l")
output, error = p.run()
And to check if a process with pid is still running? Something like this:
def process_isrunning(pid, os="linux"):
    if os=="linux":
        output, error = Cmd("ps -p %s" % pid).run()
        output = output.split("\n")
        return len(output)>=3
    if os=="windows":
        from win32com.client import GetObject
        WMI = GetObject('winmgmts:')
        processes = WMI.InstancesOf('Win32_Process')
        plist = [process.Properties_('ProcessID').Value for process in processes]
        return pid in plist
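The process is started as soon as the Cmd object is created, so its process id is available in Cmd.pid even before run() is called; run() only waits for the process to finish and collects its output. Done.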
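On Python 3 the same two tasks could be sketched with subprocess.run and os.kill. Assumptions: the command is passed as a list of arguments rather than through a login shell, and the liveness check is POSIX only. The names run_command and process_is_running are illustrative:

```python
import os
import subprocess
import sys


def run_command(args):
    """Run a command given as a list; return (returncode, stdout, stderr)."""
    completed = subprocess.run(
        args,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # decode output to str instead of bytes
    )
    return completed.returncode, completed.stdout, completed.stderr


def process_is_running(pid):
    """POSIX only: signal 0 tests for existence without sending a signal."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False  # no process with this pid
    except PermissionError:
        return True   # process exists but belongs to another user
    return True


code, out, err = run_command([sys.executable, "-c", "print('hello')"])
print(code, out.strip())
print(process_is_running(os.getpid()))  # True, this script is running
```

A finished child that has not been waited for yet (a zombie) still counts as existing for this check.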

Apache authentication with folder listing

First create a new file containing the username and password:
htpasswd -c filename username
Then allow .htaccess to override directory options:
<Directory /var/www/folder>
    Options Indexes FollowSymLinks
    AllowOverride all
    Order allow,deny
    allow from all
</Directory>
And lastly, put a .htaccess inside your folder:
AuthType Basic
AuthName "Name"
AuthBasicProvider file
AuthUserFile /path/to/pass_file
Require user username
IndexOptions IgnoreCase FancyIndexing FoldersFirst NameWidth=* DescriptionWidth=* SuppressHTMLPreamble
Voila.

.screenrc file

I always forget how I set up my .screenrc, so here it is once and for all :)
vbell off
shell -$SHELL
hardstatus alwayslastline
hardstatus string '%{gk}%{G}%H%{g}%= %{wk}%?%-Lw%?%{=b kW}%n*%f %t%?(%u)%?%{= kw}%?%+Lw%?%?%= %{g}%{Y}%l%{g}%{=b C} %c%{W}'
autodetach on # Autodetach session on hangup instead of terminating screen completely
startup_message off # Turn off the splash screen
defscrollback 30000 # Use a 30000-line scrollback buffer

Command prompt with path and bash, screen

If you start screen by default, it opens a new bash shell. But on Mac the .bashrc file is empty so there is no path in the prompt. Simply create a .bashrc file and enter:
export PS1="\h:\W \u\$ "
Now when you open screen, a new bash shell is run and the prompt is set from PS1 variable that was exported in .bashrc.

Login with ssh public/private key pair

ssh-add: this stores a key pair in the Mac keychain. Even if you delete the key from the keychain, you need to restart the computer, since the key is cached in memory and will still work if you used it before.

The pair is stored in .ssh/id_rsa (private) and .ssh/id_rsa.pub (public).

Server basic config and security

sudo apt-get install fail2ban
sudo ufw allow ssh
sudo ufw allow 80
sudo ufw enable

Mac to Unix line endings

Just use this perl one-liner to change line endings from Mac to Unix style:
perl -pe 's/\r\n|\n|\r/\n/g' inputfile > outputfile
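The same conversion in Python, as a sketch that works on bytes so that no newline translation happens behind the scenes; the function names are illustrative:

```python
import re


def to_unix_line_endings(data):
    """Convert CRLF and lone CR line endings in bytes to LF."""
    # \r\n is listed first so it becomes one \n rather than two.
    return re.sub(br"\r\n|\r", b"\n", data)


def convert_file(input_path, output_path):
    # Binary mode keeps Python from translating newlines on its own.
    with open(input_path, "rb") as src:
        data = src.read()
    with open(output_path, "wb") as dst:
        dst.write(to_unix_line_endings(data))


print(to_unix_line_endings(b"a\r\nb\rc\n"))  # b'a\nb\nc\n'
```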

Time machine backup on Truecrypt volume

Since Time machine doesn't show you mounted Truecrypt volumes, simply do:
mount
to find your volume (e.g. /Volumes/name), and then make Time machine use the volume for backups:
sudo tmutil setdestination /Volumes/name
Voila, you are doing the backup on your encrypted volume.

Sqlalchemy scoped_session

The workflow of scoped sessions in a multi-threaded web-app could look like this:
engine = create_engine('mysql://user:pass@host/database')
Session = scoped_session(sessionmaker(bind=engine))

conn = Session()
conn.query(Table)
...
conn.commit()

# and at the end
Session.remove()
And in a wsgi application Class, you could use something like this:
class MyApplication():

    def __init__(self, environ, start_response):
        engine = create_engine('mysql://user:pass@host/database')
        # keep the session registry on the instance so close() can reach it
        self.Session = scoped_session(sessionmaker(bind=engine))

    def close(self):
        self.Session.remove()

application = MyApplication

Apache server side includes

To enable server side includes (either on Apache HTTP or HTTPS), first enable mod_include:
    sudo a2enmod include
and then add the following:
<Directory "/var/www">
  Options +Includes
  XBitHack On
</Directory>
Then chmod +x your include files, and inside the html:
<!--#include file="utils.hidden" -->

Last updated: 20171222