Cloudera Manager Parcels
Anaconda Repository provides a way to integrate with Cloudera Manager to distribute your Anaconda data science artifacts to your Hadoop cluster. You can create custom parcels with the packages you want, including your own packages.
NOTE: Creating custom parcels requires a local mirror of the Anaconda packages. Anaconda Repository will not connect to http://repo.continuum.io to fetch packages that are not available locally. See the mirroring documentation.
To create a custom parcel, navigate to /<username>/installers. You can find this link in the dropdown menu, or in the Installers card on your homepage.
Now select the “Create new installer” button. This brings you to a package selection form.
You can select which accounts to fetch packages from; the anaconda user will be added by default. After an account is selected, package names can be entered into the package field.
When creating a parcel, Anaconda Repository generates a file named construct.yaml, which can be used with conda constructor, and a 64-bit Linux installer that includes the specified packages. To create just the installer script, click Create installer; to create a parcel, click Create parcel.
NOTE: By default, conda is not included in the custom parcel. To add packages to your environment, add them through the Anaconda Repository interface. To see the list of packages included in your custom parcel, check /opt/cloudera/parcels/<PARCEL_NAME>/meta/parcel.json.
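The package list in parcel.json can also be read programmatically. The following is a minimal sketch, not part of Anaconda Repository: the helper name list_parcel_packages is hypothetical, and it assumes the file keeps its packages under a top-level "packages" key whose entries carry "name" and "version" fields. Check this against a parcel.json generated by your own installation.

```python
import json


def list_parcel_packages(parcel_json_path):
    """Return (name, version) pairs for the packages listed in a parcel.json file.

    Assumption: packages are stored under a top-level "packages" key, and each
    entry has "name" and "version" fields. Missing fields come back as None.
    """
    with open(parcel_json_path, encoding="utf-8") as fh:
        meta = json.load(fh)
    return [(pkg.get("name"), pkg.get("version")) for pkg in meta.get("packages", [])]
```

On a cluster node, you would pass the path given above with <PARCEL_NAME> replaced by the name of your activated parcel.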
NOTE: The parcel is generated with the prefix of /opt/cloudera/parcels/<PARCEL_NAME>. This is the default location where activated parcels are loaded. If you are deploying parcels in a different directory, you can change this prefix with the PARCELS_ROOT configuration setting.
Once you have created a custom parcel, you can distribute it to your cluster by adding http://<repository ip>:<port>/<username>/installers/parcels/ as a Remote Parcel Repository URL.
Cloudera Manager will detect the parcels hosted on Anaconda Repository, and provide the
option to Download and Distribute the parcels.
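If you manage several accounts or repository hosts, it can help to build the Remote Parcel Repository URL from its parts instead of typing it by hand. The following is a small sketch that follows the URL pattern given above; the function name and the example host, port, and username are illustrative assumptions, not values from Anaconda Repository.

```python
def parcel_repo_url(repository_ip, port, username):
    """Build the Remote Parcel Repository URL for one Anaconda Repository account.

    Follows the pattern http://<repository ip>:<port>/<username>/installers/parcels/
    The trailing slash is kept, as in the pattern above.
    """
    return "http://{}:{}/{}/installers/parcels/".format(repository_ip, port, username)


# Hypothetical values, for illustration only.
example_url = parcel_repo_url("10.0.0.5", 8080, "alice")
```

The resulting string is what you paste into the Remote Parcel Repository URLs setting in Cloudera Manager.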
By default, Anaconda Repository generates a parcel file for every compatible distribution. You can customize which parcel distributions are created by configuring the PARCEL_DISTRO_SUFFIXES configuration setting.
NOTE: If you have configured conda via ~/.condarc on your server to use a proxy (for example, to mirror behind a proxy), you must disable proxying for the repository. See the conda documentation for more information.
For example:
proxy_servers:
    https: http://proxy.corp.example.com
    http: http://proxy.corp.example.com
    'http://<repository ip>': false