Building and running code: Paths and Environment Variables

Some classes (notably the Ling570 series) require you to write code that will be run by a TA on Patas (and Condor). You may find that code that ran fine for you fails to run for the TA. This wiki aims to clear up some common problems.

Paths

As your projects become more complex, you will inevitably need to reference code or resources in other files. Here are some common issues regarding paths:

Accessing /dropbox in a Condor job:

~/dropbox is a symlink to /opt/dropbox on Patas, but Condor does not recognize this. Use the full /opt/dropbox path instead.
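
If you are unsure what a symlinked path really points to, you can resolve it before putting it in a Condor job. The sketch below is only an illustration: it builds a throwaway symlink in a scratch directory (standing in for ~/dropbox and /opt/dropbox) and assumes the GNU `readlink -f` option is available.

```shell
#!/bin/bash
# Scratch stand-ins for /opt/dropbox (real directory) and ~/dropbox (symlink).
scratch="$(mktemp -d)"
mkdir "$scratch/real_dropbox"
ln -s "$scratch/real_dropbox" "$scratch/link_dropbox"

# readlink -f (GNU coreutils) follows symlinks and prints the full real path.
resolved="$(readlink -f "$scratch/link_dropbox")"
echo "symlink:  $scratch/link_dropbox"
echo "resolved: $resolved"
```

On Patas, running `readlink -f ~/dropbox` in the same way should print the full path to use in the job.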

Getting files relative to an executed script:

Say you have a structure like this:

~/project$ ls
data run.sh
~/project$ ls data/
datafile
~/project$ cat run.sh
#!/bin/bash
cat data/datafile

Running run.sh will print the contents of data/datafile if the user is currently in the ~/project directory. Otherwise it will fail:

~/project$ ./run.sh
xyz789
~/project$ cd data/
~/project/data$ ../run.sh
cat: data/datafile: No such file or directory

You can make run.sh find the file relative to its own location by computing the directory the script lives in:

~/project$ cat run.sh
#!/bin/bash
project_home="$( cd "$( dirname "$0" )" && pwd )"
cat "$project_home/data/datafile"
~/project$ ./run.sh
xyz789
~/project$ cd data/
~/project/data$ ../run.sh
xyz789

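The $0 variable holds the command as it was invoked (e.g. "./run.sh"), so the second line of the script sets project_home to the absolute path of the directory containing the script. This is convenient and works in most cases, but it can give the wrong directory if the script is sourced rather than executed, or if it is invoked through a symlink located elsewhere. A more reliable approach is to read the path from a config file, or to pass the path as an argument: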

~/project$ cat run.sh
#!/bin/bash
project_home="$1"
cat "$project_home/data/datafile"
~/project$ ./run.sh ~/project/
xyz789
~/project$ cd data/
~/project/data$ ../run.sh ~/project
xyz789
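
The two approaches above can be combined. The following is a minimal sketch, not a definitive implementation: `resolve_project_home` is a hypothetical helper that uses the argument if one is given, falls back to the script's own directory otherwise, and refuses paths that are not directories. It uses bash's `BASH_SOURCE` in place of `$0` so the fallback also behaves when the file is sourced.

```shell
#!/bin/bash
# Hypothetical helper: print the absolute project directory.
# Usage: resolve_project_home [DIR]
resolve_project_home() {
    local candidate="$1"
    if [ -z "$candidate" ]; then
        # No argument given: fall back to the directory holding this script.
        candidate="$(dirname "${BASH_SOURCE[0]:-$0}")"
    fi
    if [ ! -d "$candidate" ]; then
        echo "error: not a directory: $candidate" >&2
        return 1
    fi
    # Run cd in a subshell so the caller's working directory is unchanged,
    # and print an absolute path.
    ( cd "$candidate" && pwd )
}

# Intended use inside run.sh (left as comments so this sketch reads no data):
#   project_home="$(resolve_project_home "$1")" || exit 1
#   cat "$project_home/data/datafile"
```

Quoting "$project_home" everywhere keeps the script working when the path contains spaces.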

Maintaining path variables in subsequent script calls:

Above we set project_home to a path so we can access files in that directory structure. A plain shell variable is local to the shell that set it, so a script called from run.sh would not see it. Exporting the variable makes it part of the environment inherited by every script that run.sh calls. First, here's the new directory structure:

~/project$ ls
data run.sh src
~/project$ ls src/
printdata.sh
~/project$ cat run.sh
#!/bin/bash
project_home="$1"
"$project_home/src/printdata.sh"
~/project$ cat src/printdata.sh
#!/bin/bash
cat "$project_home/data/datafile"

If we run run.sh, src/printdata.sh will fail:

~/project$ ./run.sh ~/project
cat: /data/datafile: No such file or directory

So we export the variable, and it works:

~/project$ cat run.sh
#!/bin/bash
export project_home="$1"
"$project_home/src/printdata.sh"
~/project$ ./run.sh ~/project/
xyz789

Of course, you could just pass $project_home as an argument to src/printdata.sh, but as more arguments are needed it becomes unwieldy.
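
The difference between a plain and an exported variable can be checked directly by asking a child process what it sees. This is a small sketch; the variable names and the /tmp/example value are made up for illustration.

```shell
#!/bin/bash
# Plain variable: visible in this shell only.
local_only="/tmp/example"
# Exported variable: copied into the environment of every child process.
export shared="/tmp/example"

# Each 'bash -c' starts a child process, much like calling another script.
child_sees_local="$(bash -c 'echo "${local_only:-unset}"')"
child_sees_shared="$(bash -c 'echo "${shared:-unset}"')"

echo "local_only in child: $child_sees_local"
echo "shared in child:     $child_sees_shared"

# A variable can also be set for a single command without exporting it here.
one_off="$(one_shot="/tmp/example" bash -c 'echo "$one_shot"')"
echo "one_shot in child:   $one_off"
```

The child reports `unset` for `local_only` and `/tmp/example` for the other two.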

Path modification

Sometimes you may be tempted to modify the paths you are given. For instance, let's say you have a script, depunc.sh, that removes punctuation like ?!,.". You give it the path to a file, and you want the output to go to a file of the same name but in an output/ directory. You may do something like this:

#!/bin/bash
textfile="$1"
sed 's/[?.!,"]//g' < "$textfile" > "output/$textfile"

This breaks in many cases where the input path is more than a bare filename. For instance, if we give it "data/myfile", it will fail unless the output/ directory already contains a data/ subdirectory. If we give it "../myfile", it may succeed, but the output path becomes output/../myfile, so the file ends up in the directory above output/ rather than inside it.
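
To see the problem concretely, it can help to print how the output path expands for a few inputs. The sketch below does only that, and shows `basename` as one possible way to strip the directory part; the sample paths are made up, and nothing is read or written.

```shell
#!/bin/bash
# Illustrative inputs only; no files are touched.
for textfile in "myfile" "data/myfile" "../myfile"; do
    naive="output/$textfile"                   # what the script above builds
    flat="output/$(basename "$textfile")"      # directory part stripped
    echo "$textfile: naive=$naive flat=$flat"
done
```

Note that `basename` has its own pitfall: two inputs with the same filename in different directories would overwrite each other's output.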

You may also try to modify the end of the path:

[...]
sed 's/[?.!,"]//g' < "$textfile" > "$textfile.nopunc"

This will fail if you don't have write access to the directory where $textfile resides.

The best solution is to not try anything complicated. You can either give an output filename as an argument:

[...]
outfile="$2"
sed 's/[?.!,"]//g' < "$textfile" > "$outfile"

or just print the output to STDOUT (i.e. don't redirect anything inside the script), and let the caller handle the redirection:

[...]
sed 's/[?.!,"]//g' < "$textfile"
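
With the STDOUT version, the caller decides where the output goes. The following sketch shows that division of labor; the `depunc` function stands in for depunc.sh, and the scratch directory and sample sentence are invented for the demonstration.

```shell
#!/bin/bash
# Stand-in for depunc.sh: strip ? . ! , " and write the result to STDOUT.
depunc() {
    sed 's/[?.!,"]//g' < "$1"
}

# Scratch area so the demonstration does not touch real data.
workdir="$(mktemp -d)"
printf 'Hello, world! "Really?" Yes.\n' > "$workdir/input.txt"
mkdir -p "$workdir/output"

# The caller, not the script, chooses the output location.
depunc "$workdir/input.txt" > "$workdir/output/input.txt"
cat "$workdir/output/input.txt"
```

This prints `Hello world Really Yes`. Because the script never builds an output path, it cannot get one wrong.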

Environment Variables

Above, we exported a variable so it would be accessible to subsequent script calls. That was an example of creating an environment variable. There are a number of standard environment variables, such as PATH, HOME, USER, HOSTNAME, etc. PATH holds a colon-delimited list of directories in which the shell looks for executables (such as `ls`, `cat`, `vim`, `scp`, etc.). You can modify PATH, but if you clear it the shell will no longer find most commands by name. Some programs have their own standard environment variables. Java tools use JAVA_HOME and CLASSPATH, and Python uses PYTHONHOME and PYTHONPATH.
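
As a quick way to inspect PATH, the sketch below splits it on colons, asks the shell where it finds `cat`, and prepends a directory. The `$HOME/bin` directory is only an assumed example and need not exist.

```shell
#!/bin/bash
# Split PATH on colons into an array, one directory per element.
IFS=':' read -r -a path_dirs <<< "$PATH"
for dir in "${path_dirs[@]}"; do
    echo "$dir"
done

# 'command -v' prints the file the shell would run for a given name.
cat_location="$(command -v cat)"
echo "cat is found at: $cat_location"

# Prepending a directory (hypothetical here) makes it the first one searched.
extra_dir="$HOME/bin"
PATH="$extra_dir:$PATH"
```

Prepending or appending keeps the existing entries intact, which is safer than assigning PATH a completely new value.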

You can see what environment variables are set with the `env` command:

~$ env
HOSTNAME=patas.ling.washington.edu
[...]
G_BROKEN_FILENAMES=1
_=/bin/env

If you only want to see the value of one variable, you can echo it:

~$ echo "$HOSTNAME"
patas.ling.washington.edu

`unset` clears a variable. Be careful: it removes the variable entirely for the rest of the session, including any value set at login, rather than undoing only your recent changes:

~$ echo "$CLASSPATH"
:/opt/mysql-connector-java/mysql-connector-java-5.1.6-bin.jar:/opt/GNUstep/System/Library/Libraries/Java:/opt/GNUstep/Local/Library/Libraries/Java:/home2/goodmami/GNUstep/Library/Libraries/Java
~$ unset CLASSPATH
~$ echo "$CLASSPATH"

A change to an environment variable applies to the current process and to any processes it starts afterwards. For instance, if you change it at the command line, the change holds for any further commands you run in that session, but not in other sessions (for instance, if you are logged in to patas.ling.washington.edu in two or more simultaneous sessions). If you modify it within a script, the change holds in that script and in anything it subsequently calls, but not in the shell that launched the script.
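
This scoping can be observed with a subshell, which is a separate process that gets its own copy of the environment. The sketch below uses a made-up variable, DEMO_VAR, and shows that neither a change nor an `unset` inside the subshell reaches the parent.

```shell
#!/bin/bash
export DEMO_VAR="original"   # hypothetical variable, for illustration only

# Command substitution runs in a subshell with its own copy of DEMO_VAR.
in_subshell="$( DEMO_VAR="changed"; echo "$DEMO_VAR" )"
after_subshell="$DEMO_VAR"

# unset inside a subshell likewise leaves the parent's value untouched.
unset_in_subshell="$( unset DEMO_VAR; echo "${DEMO_VAR:-<empty>}" )"
after_unset="$DEMO_VAR"

echo "$in_subshell $after_subshell $unset_in_subshell $after_unset"
```

This prints `changed original <empty> original`. Wrapping a risky change in parentheses is therefore a simple way to keep it from leaking into the rest of a script.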

Environment variables are usually set for user sessions in bash startup files. The usual locations are ~/.bash_profile, which bash reads for login shells, and ~/.bashrc, which it reads for interactive non-login shells. Please note that your settings there will not necessarily match what your classmates or TA have, so don't rely on them in your programs.
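
One way to avoid silently depending on your own startup files is to have a script check the variables it needs and stop with a clear message if any are missing. This is a minimal sketch; `require_env` and the DEMO_ variable names are hypothetical, and it relies on bash's `${!name}` indirect expansion.

```shell
#!/bin/bash
# Hypothetical helper: fail if any named environment variable is empty or unset.
# Usage: require_env NAME [NAME...]
require_env() {
    local name
    for name in "$@"; do
        # ${!name} expands to the value of the variable whose name is in $name.
        if [ -z "${!name}" ]; then
            echo "error: environment variable $name is not set" >&2
            return 1
        fi
    done
}

# Example: one variable is set, the other is not.
export DEMO_REQUIRED="some value"
unset DEMO_MISSING
if require_env DEMO_REQUIRED; then
    echo "DEMO_REQUIRED is available"
fi
if ! require_env DEMO_MISSING 2>/dev/null; then
    echo "DEMO_MISSING is not available"
fi
```

A check like this at the top of run.sh turns a confusing failure on the TA's account into an explicit error.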

-- Main.goodmami - 2012-04-25

Topic revision: r5 - 2012-04-27 - 16:09:36 - brodbd
 
