Since training on the CPU is sometimes slow, I compiled TensorFlow with GPU support. I installed it on a desktop and set up
openssh-server there, since my laptop does not have an Nvidia GPU.
- Fedora 25 Workstation with GUI.
- Kernel version
- GTX 950.
- We use
- Public IP address.
- Need admin permission.
We need to compile several things from source, and the dependencies between them are tricky to manage. Here is how they relate.
- compiles by gcc6
- Nvidia CUDA and cuDNN
- python and pip
- Bazel (compiled from source code)
To install openssh-server, type the following (Fedora may have it already)
sudo dnf install openssh-server
To see if it is running,
/sbin/service sshd status
If it is not running, start it and enable it at boot:
systemctl start sshd.service
systemctl enable sshd.service
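To change the port, edit /etc/ssh/sshd_config, find the line
Port 22
and replace 22 with another port, such as 12340. Then register the new port with SELinux and restart sshd:
semanage port -a -t ssh_port_t -p tcp 12340
systemctl restart sshd.service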
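The Port edit described above can also be scripted. The following is a minimal sketch, not a definitive recipe: set_sshd_port is a hypothetical helper name, and the demo runs on a scratch file rather than on the live /etc/ssh/sshd_config, which would require root.

```shell
# Hypothetical helper: set the Port directive in an sshd_config-style file.
# Usage: set_sshd_port FILE PORT
set_sshd_port() {
    local file="$1" port="$2"
    if grep -Eq '^[[:space:]]*#?[[:space:]]*Port[[:space:]]+[0-9]+' "$file"; then
        # Rewrite the existing (possibly commented-out) Port line.
        sed -i -E "s/^[[:space:]]*#?[[:space:]]*Port[[:space:]]+[0-9]+.*/Port ${port}/" "$file"
    else
        # No Port line at all: append one.
        printf 'Port %s\n' "$port" >> "$file"
    fi
}

# Try it on a scratch file first, not on the real configuration.
scratch="$(mktemp)"
printf '#Port 22\nPermitRootLogin no\n' > "$scratch"
set_sshd_port "$scratch" 12340
cat "$scratch"
```

Once the result looks right, the same call can be pointed at the real configuration file as root, followed by the semanage and systemctl restart commands.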
To test if it works ok,
ssh firstname.lastname@example.org -p 12340
Install Gcc 5
By default, Fedora 25 ships gcc 6. Install it together with the kernel headers:
sudo dnf install gcc gcc-c++ kernel-devel kernel-headers
CUDA 8.0 is not compatible with gcc versions later than 5, so we need gcc 5. There are other ways to get it, but here I chose to compile it from source. Note that you need gcc 5.4, since gcc 5.3 seems to have a bug that shows up during compilation.
# Download
wget http://ftp.gnu.org/gnu/gcc/gcc-5.4.0/gcc-5.4.0.tar.gz
# Unpack (the archive is gzip-compressed, so use z rather than j)
tar xzf gcc-5.4.0.tar.gz
cd gcc-5.4.0
# Download prerequisites
./contrib/download_prerequisites
cd ..
# Build in a separate directory
mkdir objdir
cd objdir
../gcc-5.4.0/configure --with-system-zlib --disable-multilib --enable-languages=c,c++ --prefix=/home/jasonqsy/gcc54
make -j4
make install
Note that you need to set --prefix by hand, since the default is /usr/local. To test it, ask the newly built gcc for its version. It should show 5.4.0.
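The version check can be automated. The sketch below assumes gcc was installed under $HOME/gcc54 (the prefix used in this post, so adjust it to your own). gcc_ok_for_cuda8 is a hypothetical helper name that only encodes the "not later than 5" rule stated above.

```shell
# Hypothetical helper: succeed only if a gcc version string has major version <= 5.
gcc_ok_for_cuda8() {
    local major="${1%%.*}"   # "5.4.0" -> "5"
    [ "$major" -le 5 ]
}

# Assumed location, matching the --prefix used when building gcc.
GCC_BIN="$HOME/gcc54/bin/gcc"
if [ -x "$GCC_BIN" ]; then
    version="$("$GCC_BIN" -dumpversion)"
    if gcc_ok_for_cuda8 "$version"; then
        echo "gcc $version can be used as the nvcc host compiler"
    else
        echo "gcc $version is too new for CUDA 8.0"
    fi
else
    echo "no gcc found at $GCC_BIN"
fi
```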
Personally, I use pyenv to manage Python versions and install Anaconda, because it is designed for scientific computing. To install pyenv, follow the pyenv instructions or type the following.
git clone https://github.com/pyenv/pyenv.git ~/.pyenv
echo 'export PYENV_ROOT="$HOME/.pyenv"' >> ~/.zshrc
echo 'export PATH="$PYENV_ROOT/bin:$PATH"' >> ~/.zshrc
echo 'eval "$(pyenv init -)"' >> ~/.zshrc
exec $SHELL
pyenv is now available. To install Anaconda and make it the default Python, type
pyenv install anaconda3-4.3.0
pyenv global anaconda3-4.3.0
To test the installation, type
conda list
python --version
The Python version should be 3.6. Moreover, protobuf should be installed with
pip install protobuf
- CUDA 8.0
Download CUDA from https://developer.nvidia.com/cuda-downloads. Then
sudo rpm -i cuda-repo-fedora23-8-0-local-ga2-8.0.61-1.x86_64.rpm
sudo dnf clean all
sudo dnf install cuda
CUDA will be installed at /usr/local/cuda-8.0. We will need this directory later, when configuring the TensorFlow build.
For cuDNN, register and download it from https://developer.nvidia.com/cudnn. Extracting the archive is enough. I put it at /usr/local/cudnn, the location later given to ./configure.
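Depending on how the shell is set up, the CUDA and cuDNN directories may also need to be on PATH and on the loader path. A minimal sketch, assuming cuDNN was unpacked so that its headers and libraries sit under /usr/local/cudnn/include and /usr/local/cudnn/lib64. That layout is an assumption, so adjust both variables to your own installation.

```shell
# Assumed locations: CUDA as installed by the rpm, cuDNN unpacked by hand.
CUDA_HOME=/usr/local/cuda-8.0
CUDNN_HOME=/usr/local/cudnn

# Put nvcc on PATH and both library directories on the loader path.
export PATH="$CUDA_HOME/bin:$PATH"
export LD_LIBRARY_PATH="$CUDA_HOME/lib64:$CUDNN_HOME/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"

# Report anything that is missing now, instead of failing later in the build.
for f in "$CUDA_HOME/bin/nvcc" "$CUDNN_HOME/include/cudnn.h"; do
    if [ -e "$f" ]; then
        echo "found   $f"
    else
        echo "missing $f"
    fi
done
```

Appending the two export lines to ~/.zshrc makes them persistent, in the same way as the pyenv settings.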
Fedora does not provide a Bazel binary package directly, so it would have to be compiled from source. Here we use a trick: I download a prebuilt bazel 0.4.2 to resolve the dependencies. Get the bazel installer from https://github.com/bazelbuild/bazel/releases, then type
chmod +x bazel-version-installer-os.sh
./bazel-version-installer-os.sh
You can set a custom install location with
--prefix ~. By default, it is
Make sure that your gcc version is not later than 5, because CUDA 8.0 does not support later versions.
First, we need to download the source code and configure the build.
git clone https://github.com/tensorflow/tensorflow
cd tensorflow
./configure
./configure
Please specify the location of python. [Default is /home/jasonqsy/.pyenv/shims/python]:
Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native]:
Do you wish to use jemalloc as the malloc implementation? [Y/n] n
jemalloc disabled
Do you wish to build TensorFlow with Google Cloud Platform support? [y/N] n
No Google Cloud Platform support will be enabled for TensorFlow
Do you wish to build TensorFlow with Hadoop File System support? [y/N] n
No Hadoop File System support will be enabled for TensorFlow
Do you wish to build TensorFlow with the XLA just-in-time compiler (experimental)? [y/N] n
No XLA JIT support will be enabled for TensorFlow
Found possible Python library paths:
  /home/jasonqsy/.pyenv/versions/anaconda3-4.3.0/lib/python3.6/site-packages
Please input the desired Python library path to use. Default is [/home/jasonqsy/.pyenv/versions/anaconda3-4.3.0/lib/python3.6/site-packages]
Using python library path: /home/jasonqsy/.pyenv/versions/anaconda3-4.3.0/lib/python3.6/site-packages
Do you wish to build TensorFlow with OpenCL support? [y/N] n
No OpenCL support will be enabled for TensorFlow
Do you wish to build TensorFlow with CUDA support? [y/N] y
CUDA support will be enabled for TensorFlow
Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/gcc]: /home/jasonqsy/gcc54/bin/gcc
Please specify the CUDA SDK version you want to use, e.g. 7.0. [Leave empty to use system default]: 8.0
Please specify the location where CUDA 8.0 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]: /usr/local/cuda-8.0
Please specify the Cudnn version you want to use. [Leave empty to use system default]:
Please specify the location where cuDNN library is installed. Refer to README.md for more details. [Default is /usr/local/cuda-8.0]: /usr/local/cudnn
Please specify a list of comma-separated Cuda compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your build time and binary size.
[Default is: "3.5,5.2"]: 3.5,5.2
........
INFO: All external dependencies fetched successfully.
Configuration finished
bazel build --config=opt --config=cuda //tensorflow/tools/pip_package:build_pip_package
My desktop took about 40 minutes to compile it.
bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
pip install /tmp/tensorflow_pkg/tensorflow-1.0.1-cp36-cp36m-linux_x86_64.whl
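The wheel file name depends on the TensorFlow version and the Python ABI, so a hard-coded name breaks as soon as either changes. A small sketch that picks the most recently modified wheel from the output directory instead. newest_wheel is a hypothetical helper name, not part of TensorFlow or pip, and the sketch only prints what it would install.

```shell
# Hypothetical helper: print the most recently modified tensorflow wheel in a directory.
newest_wheel() {
    ls -t "$1"/tensorflow-*.whl 2>/dev/null | head -n 1
}

pkg_dir=/tmp/tensorflow_pkg
wheel="$(newest_wheel "$pkg_dir")"
if [ -n "$wheel" ]; then
    # Replace echo with the real command once the path looks right: pip install "$wheel"
    echo "would install: $wheel"
else
    echo "no wheel found in $pkg_dir"
fi
```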
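The installation is finished, but one hack is still needed. There seems to be a problem with the libstdc++.so.6 bundled with Anaconda (see the CXXABI_1.3.8 reference below), so back it up and replace it with the one from our gcc 5.4 build:
mv ~/.pyenv/versions/anaconda3-4.3.0/lib/libstdc++.so.6 ~/.pyenv/versions/anaconda3-4.3.0/lib/libstdc++.so.6.bak
cp ~/gcc54/lib64/libstdc++.so.6 ~/.pyenv/versions/anaconda3-4.3.0/lib/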
import tensorflow as tf
hello = tf.constant('Hello, TensorFlow!')
sess = tf.Session()
print(sess.run(hello))
2017-03-25 03:16:08.263158: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:901] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2017-03-25 03:16:08.263366: I tensorflow/core/common_runtime/gpu/gpu_device.cc:887] Found device 0 with properties:
name: GeForce GTX 950
major: 5 minor: 2 memoryClockRate (GHz) 1.3165
pciBusID 0000:01:00.0
Total memory: 1.95GiB
Free memory: 1.84GiB
2017-03-25 03:16:08.263380: I tensorflow/core/common_runtime/gpu/gpu_device.cc:908] DMA: 0
2017-03-25 03:16:08.263384: I tensorflow/core/common_runtime/gpu/gpu_device.cc:918] 0: Y
2017-03-25 03:16:08.263930: I tensorflow/core/common_runtime/gpu/gpu_device.cc:977] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 950, pci bus id: 0000:01:00.0)
- Installing Bazel
- How to Install TensorFlow on Fedora with CUDA GPU acceleration
- Installing TensorFlow from Sources
- Installing GCC
- “'CXXABI_1.3.8' not found” in tensorflow-gpu - install from source
- NVIDIA CUDA Installation Guide for Linux