Conda management¶
Overview¶
One of the primary features of Anaconda for cluster management is the remote deployment and management of Anaconda environments across a cluster.
Prepending acluster to core conda commands will execute those commands across all of the cluster nodes.
For example, to view the conda environments on all of the cluster nodes:
$ acluster conda info -e
All nodes (x3) response:
# conda environments:
#
root * /opt/anaconda
To install numpy on all of the cluster nodes:
$ acluster conda install numpy
Installing (u'numpy',) on cluster "demo_cluster"
Node "ip-10-234-8-208.ec2.internal":
Successful actions: 1/1
Node "ip-10-170-59-28.ec2.internal":
Successful actions: 1/1
Node "ip-10-232-42-58.ec2.internal":
Successful actions: 1/1
For a concise overview of all of the available commands, view the Anaconda for cluster management Cheat Sheet.
Installing packages from channels¶
You can install packages from Anaconda Cloud or from an Anaconda Repository installation by adding the --channel/-c option. For example, to install the apache-libcloud package from the anaconda-cluster channel:
$ acluster conda install -c https://conda.anaconda.org/anaconda-cluster apache-libcloud
Installing (u'apache-libcloud',) on cluster "demo_cluster"
Node "ip-10-136-80-92.ec2.internal":
Successful actions: 1/1
Node "ip-10-63-173-62.ec2.internal":
Successful actions: 1/1
$ acluster conda list | grep apache-libcloud
- 'apache-libcloud: 0.16.0 (py27_0)'
List of remote conda commands¶
The following conda commands are available in Anaconda for cluster management:
acluster conda install – Install package(s)
acluster conda update – Update package(s)
acluster conda remove – Remove package(s)
acluster conda list – List package(s)
acluster conda create – Create conda environment
acluster conda info – Display information about current conda install
acluster conda push – Push conda environment to cluster
For more information about conda, refer to the conda documentation.
Example remote conda commands¶
Install conda packages¶
Any conda package can be installed on a cluster managed by Anaconda for cluster management:
$ acluster conda install numpy
Node "ip-10-136-80-92.ec2.internal":
Successful actions: 1/1
Node "ip-10-63-173-62.ec2.internal":
Successful actions: 1/1
Verify that the package was installed using the list command:
$ acluster conda list
All nodes (x2) response:
...
- 'numpy: 1.9.2 (py27_0)'
...
You may also install multiple conda packages with a single command:
$ acluster conda install scipy pandas scikit-learn
Node "ip-10-136-80-92.ec2.internal":
Successful actions: 3/3
Node "ip-10-63-173-62.ec2.internal":
Successful actions: 3/3
$ acluster conda list
All nodes (x2) response:
...
- 'pandas: 0.16.1 (np19py27_0)'
...
- 'scikit-learn: 0.16.1 (np19py27_0)'
- 'scipy: 0.15.1 (np19py27_0)'
...
Note: It is recommended that you install all of the packages you want at once. Installing one package at a time can result in dependency conflicts.
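As a minimal sketch of the note above, the helper below assembles one install command from a list of package names, so that every package is resolved in a single operation. The function name build_install_cmd is hypothetical and not part of acluster; the sketch only prints the command so it can be reviewed before it is run.

```shell
# Hypothetical helper (not part of acluster): print one install command
# that covers every package name passed as an argument.
build_install_cmd() {
    printf 'acluster conda install'
    printf ' %s' "$@"
    printf '\n'
}

# Print the command for review rather than running it.
build_install_cmd scipy pandas scikit-learn
```

The printed line can then be copied to the prompt and run on the client machine.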
List conda packages¶
To see which packages are installed on the nodes, use the list command (by default, all of the nodes should have the same packages):
$ acluster conda list
All nodes (x2) response:
- 'libsodium: 0.4.5 (0)'
- 'sqlite: 3.8.4.1 (1)'
- 'conda-env: 2.1.4 (py27_0)'
- 'python: 2.7.9 (3)'
...
Update conda packages¶
You can install a specific version of a conda package, and later update it to the latest available version:
$ acluster conda install pandas==0.13
Installing (u'pandas==0.13',) on cluster "demo_cluster"
Node "ip-10-136-80-92.ec2.internal":
Successful actions: 1/1
Node "ip-10-63-173-62.ec2.internal":
Successful actions: 1/1
$ acluster conda list | grep pandas
- 'pandas: 0.13.0 (np18py27_0)'
$ acluster conda update pandas
Updating (u'pandas',) on cluster "demo_cluster"
Node "ip-10-136-80-92.ec2.internal":
Successful actions: 1/1
Node "ip-10-63-173-62.ec2.internal":
Successful actions: 1/1
$ acluster conda list | grep pandas
- 'pandas: 0.16.1 (np19py27_0)'
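When checking versions in a script, the grep step above can be replaced by a small filter that returns only the version string. This is a sketch that assumes the list output keeps the `- 'name: version (build)'` shape shown in the examples; the function name pkg_version is hypothetical, and the package name is used as a sed pattern, so names containing regular-expression characters would need escaping.

```shell
# Hypothetical helper: read `acluster conda list` style output on stdin
# and print the version of the package named in $1 (nothing if absent).
pkg_version() {
    sed -n "s/^[[:space:]]*- '$1: \([^ ]*\) .*/\1/p"
}

# Sample lines copied from the listings in this document.
printf '%s\n' \
    "- 'numpy: 1.9.2 (py27_0)'" \
    "- 'pandas: 0.16.1 (np19py27_0)'" | pkg_version pandas
```

Against a live cluster, the same filter would be fed from the real command, for example `acluster conda list | pkg_version pandas`. An empty result means the package was not found in the output.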
Remove conda packages¶
You can remove conda packages across a cluster:
$ acluster conda remove pandas
Removing (u'pandas',) on cluster "demo_cluster"
Node "ip-10-136-80-92.ec2.internal":
Successful actions: 1/1
Node "ip-10-63-173-62.ec2.internal":
Successful actions: 1/1
$ acluster conda list | grep pandas
... NO OUTPUT ...
Create conda environments¶
You can also manage conda environments across a cluster with Anaconda for cluster management.
To create a new conda environment that contains Python and numpy, use the command conda create -n test_env numpy. On a cluster, use the same command and simply prepend acluster as shown:
$ acluster conda create -n test_env numpy
All nodes (x2) response:
Conda environment "test_env" created
Once the environment is created, refer to that named environment by adding the -n name option to conda commands:
$ acluster conda list -n test_env
All nodes (x2) response:
- 'sqlite: 3.8.4.1 (1)'
- 'python: 2.7.9 (3)'
- 'zlib: 1.2.8 (0)'
- 'openssl: 1.0.1k (1)'
- 'system: 5.8 (2)'
- 'tk: 8.5.18 (0)'
- 'setuptools: 15.2 (py27_0)'
- 'pip: 6.1.1 (py27_0)'
- 'readline: 6.2 (2)'
- 'numpy: 1.9.2 (py27_0)'
$ acluster conda install -n test_env requests
Installing (u'requests',) on cluster "d" - target: "*"
Node "ip-10-136-80-92.ec2.internal":
Successful actions: 1/1
Node "ip-10-63-173-62.ec2.internal":
Successful actions: 1/1
$ acluster conda list -n test_env
All nodes (x2) response:
- 'sqlite: 3.8.4.1 (1)'
- 'python: 2.7.9 (3)'
- 'zlib: 1.2.8 (0)'
- 'openssl: 1.0.1k (1)'
- 'system: 5.8 (2)'
- 'tk: 8.5.18 (0)'
- 'setuptools: 15.2 (py27_0)'
- 'pip: 6.1.1 (py27_0)'
- 'readline: 6.2 (2)'
- 'numpy: 1.9.2 (py27_0)'
- 'requests: 2.7.0 (py27_0)'
Push conda environments¶
You can also push conda environments from the client machine to the cluster by using a conda environment.yml file:
$ acluster conda push ./environment.yml
['ip-10-234-8-208.ec2.internal'] nodes response:
Conda environment with "/tmp/anaconda-cluster/environment.yml" created
...
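For reference, the sketch below writes a minimal environment.yml on the client machine. The environment name stats is borrowed from the listing in the next section; the dependencies (numpy and pandas) are assumptions chosen only for illustration, and a real file would list whatever packages the environment needs.

```shell
# Hypothetical example: the package choices below are illustrative only.
env_file="./environment.yml"

# Write a minimal conda environment file: a name plus its dependencies.
cat > "$env_file" <<'EOF'
name: stats
dependencies:
  - numpy
  - pandas
EOF

# Show the file that would be passed to `acluster conda push`.
cat "$env_file"
```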
List conda environments¶
To verify that the environment has been pushed to all nodes, use the info command:
$ acluster conda info -e
All nodes (x3) response:
# conda environments:
#
stats /opt/anaconda/envs/stats
root * /opt/anaconda
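The check above can also be scripted. The following sketch assumes the response keeps the layout shown in the listing, with the environment name in the first column; the function name has_env is hypothetical, and the sample text stands in for real command output.

```shell
# Hypothetical helper: succeed if an environment named $1 appears in the
# first column of `conda info -e` style output read from stdin.
has_env() {
    awk -v env="$1" '$1 == env { found = 1 } END { exit !found }'
}

# Sample response copied from the listing above.
sample_output='# conda environments:
#
stats /opt/anaconda/envs/stats
root * /opt/anaconda'

if printf '%s\n' "$sample_output" | has_env stats; then
    echo "stats: present"
else
    echo "stats: missing"
fi
```

On a live cluster the sample text would be replaced by the real command, for example `acluster conda info -e | has_env stats`.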