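(What follows is a summary of the many pitfalls I hit the first time I set up a GPU, together with the fixes I tried. Because installing a standalone GPU raises device compatibility issues, I would suggest buying a prebuilt GPU machine instead (for example /3964771.html), which can be set up with a few lines of commands. ---.10.04)
Machine configuration: Ubuntu 14.04 (64-bit) + GeForce GTX 970
Choosing the operating system: Ubuntu 14.04
Possibly because of my particular hardware, installing Fedora 23, Fedora 24, Fedora 25, and Ubuntu 16.04 on this machine each failed in some way. The problems were:
1. With newer releases such as Ubuntu 16.04 and Fedora 25, the GTX 970's output carries no signal: the installer goes straight to a black screen and the monitor reports no input signal. Windows, however, boots normally.
2. With an older Fedora (Fedora 23), there is a signal at first but the screen goes black once the installer starts; with Fedora 25 there is no signal at all.
3. After Fedora 23 finished installing, the system froze, and after a reboot I could not log in.
I then tried various fixes. For example, on the assumption that the black screens were caused by the system shipping without an NVIDIA driver, I first installed the driver while running on the onboard integrated graphics and only then switched to the discrete GTX 970. This changed the screen from "no signal" to "signal, but black".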
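From these repeated installation attempts I drew a few inferences:
1. Windows boots normally, so the problem lies with the operating system, not the GPU.
2. The newer Fedora 25 and Ubuntu 16.04 cannot get a signal out of the GTX 970, whereas the older Fedora does produce one, which suggested trying older releases of Fedora and Ubuntu.
3. Fedora 23-25 gave no boot signal on the GTX 970 until the driver was installed, whereas Ubuntu 16.04 started the installation with a normal signal and only went black during installation, which suggested that an older Ubuntu might fare better.
So in the end I tried the older Ubuntu 14.04 and, as inferred, the installation went smoothly.
Ubuntu download: /download/alternative-downloads
Prerequisites to install on the new system
After Ubuntu 14.04 is installed, some necessary functionality is missing because the release is fairly old. For example, Ctrl+Alt+F1 (F2-F6) does not switch to a text console (tty), which the CUDA installation requires. The gcc and g++ compilers are also missing, and apt-get needs updating. I therefore suggest running the following: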
sudo sed -i -e 's/#GRUB_TERMINAL/GRUB_TERMINAL/g' /etc/default/grub
sudo update-grub
sudo apt-get install nautilus-open-terminal
sudo apt-get update
sudo apt-get install gcc g++ linux-headers-$(uname -r)
sudo apt-get install vim
sudo apt-get install python-pip python-dev
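After installing these, reboot; only then does switching to the text console work.
With Ubuntu 14.04 installed, install CUDA 8.0
Installing CUDA 8.0 also took several attempts:
1. First I downloaded the GTX 970 display driver from the NVIDIA website and installed it,
following the instructions exactly, including uninstalling any existing NVIDIA driver and stopping lightdm.
I ran the NVIDIA installer many times and hit a variety of errors. When it finally installed, the GPU misbehaved: the fan spun at full speed, and after a reboot the Ubuntu login screen never appeared, only a black screen.
2. I installed an older NVIDIA driver, such as NVIDIA-3.4, and then CUDA 8.0, answering both yes and no in turn to the installer prompts for the NVIDIA Accelerated Graphics Driver and the OpenGL libraries. The installation itself succeeded, but after a reboot the machine went straight to a black screen instead of the login screen.
3. Since CUDA 8.0 bundles an NVIDIA driver, I also tried installing CUDA 8.0 alone, without installing a GTX 970 driver beforehand. After many attempts this still failed. On one attempt, the examples in NVIDIA_CUDA-8.0_Samples compiled and ran correctly from the tty, but switching to the lightdm graphical session gave a black screen, and neither Ctrl+Alt+F1-F6 nor a reboot got me back to the tty.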
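Download the CUDA 8.0 driver
Download the driver from NVIDIA's CUDA website; here I downloaded the CUDA 8.0 runfile for Ubuntu 14.04, NVIDIA-Linux-x86_64-375.26.run
Install CUDA 8.0 following the official documentation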
Verify You Have a CUDA-Capable GPU:
hd@hd:~$ lspci | grep -i nvidia
01:00.0 VGA compatible controller: NVIDIA Corporation GM204 [GeForce GTX 970] (rev a1)
01:00.1 Audio device: NVIDIA Corporation GM204 High Definition Audio Controller (rev a1)
There are many CUDA installation tutorials online, each probably suited to a different situation. I strongly recommend following the official Installation Guide for Linux.
The main steps are:
1. Perform the pre-installation actions.
2. Disable the Nouveau drivers.
3. Reboot into text mode (runlevel 3).
4. Verify that the Nouveau drivers are not loaded.
5. Run the installer and follow the on-screen prompts:
$ sudo bash cuda_8.0.44_linux.run
Disabling the nouveau driver
To install the Display Driver, the nouveau driver has to be disabled, as follows:
On Ubuntu, create the file /etc/modprobe.d/blacklist-nouveau.conf
sudo vim /etc/modprobe.d/blacklist-nouveau.conf
and add
blacklist nouveau
options nouveau modeset=0
Save and exit, then regenerate the kernel initramfs by running
sudo update-initramfs -u
so that the change takes effect. Then check whether nouveau has been disabled:
lsmod | grep nouveau
If this prints nothing, nouveau has been disabled.
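For use inside a script, the same check can be wrapped in a small function. This is only a sketch: the name nouveau_loaded is made up for illustration, and it reads lsmod-style text on stdin so it can also be tried on saved output. Unlike a plain grep, it matches only the module-name column, so a line that merely mentions nouveau elsewhere does not count.

```shell
#!/bin/sh
# nouveau_loaded: hypothetical helper, not part of any NVIDIA tool.
# Reads `lsmod`-style text on stdin and succeeds (exit status 0) if a
# module named exactly "nouveau" appears in the first column.
nouveau_loaded() {
    awk '$1 == "nouveau" { found = 1 } END { exit (found ? 0 : 1) }'
}

# Example: inspect the running kernel, if lsmod is available.
if command -v lsmod >/dev/null 2>&1; then
    if lsmod | nouveau_loaded; then
        echo "nouveau is still loaded"
    else
        echo "nouveau is not loaded"
    fi
fi
```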
Disabling the graphical interface
sudo service lightdm stop
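Booting Ubuntu 14.04 into runlevel 3:
By default Ubuntu boots at runlevel 2 into the graphical interface, but the graphical interface must be disabled before the CUDA driver is installed, so the login has to happen directly on the text console.
The machine therefore needs to boot at runlevel 3, that is, into the text console (tty) only. This also avoids being stuck with a black screen when a reboot goes straight into the graphical interface.
I tried many methods; the one that worked is:
First, open the file /etc/default/grub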
sudo vi /etc/default/grub
Change GRUB_CMDLINE_LINUX_DEFAULT="quiet splash" to:
GRUB_CMDLINE_LINUX_DEFAULT="text"
Then run sudo update-grub. On the next boot the machine starts in the text console.
/jk110333/article/details/17878843
/questions/615634/how-to-set-default-runlevel
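The GRUB edit can also be scripted rather than done by hand in an editor. The following is a minimal sketch, assuming the file contains a single GRUB_CMDLINE_LINUX_DEFAULT line; the function name set_grub_text_mode is hypothetical. It is demonstrated on a scratch copy so that nothing under /etc is touched.

```shell
#!/bin/sh
# set_grub_text_mode: hypothetical helper. Rewrites the
# GRUB_CMDLINE_LINUX_DEFAULT line of the given file to "text".
# It modifies whatever file it is given, so try it on a copy first.
set_grub_text_mode() {
    file=$1
    sed 's/^GRUB_CMDLINE_LINUX_DEFAULT=.*/GRUB_CMDLINE_LINUX_DEFAULT="text"/' \
        "$file" > "$file.new" && mv "$file.new" "$file"
}

# Dry run on a scratch file instead of /etc/default/grub.
tmp=$(mktemp)
printf '%s\n' 'GRUB_DEFAULT=0' \
    'GRUB_CMDLINE_LINUX_DEFAULT="quiet splash"' \
    'GRUB_CMDLINE_LINUX=""' > "$tmp"
set_grub_text_mode "$tmp"
result=$(cat "$tmp")
rm -f "$tmp"
printf '%s\n' "$result"
```

On the real file this would still need root privileges and a following sudo update-grub, exactly as in the manual steps.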
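Adding the cuDNN acceleration library
Download the cuDNN package, extract it, and copy its files into the corresponding directories of the CUDA installation:
tar xvzf cudnn-8.0-linux-x64-v5.1.tgz  # use the file name of the version you actually downloaded
# the extracted folder is named cuda
sudo cp cuda/include/cudnn.h /usr/local/cuda/include
sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64
sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*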
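After copying, it is worth confirming that the files actually landed where CUDA expects them. A small sketch, assuming the /usr/local/cuda layout used above; check_cudnn is a made-up name, and the check only looks for the files, not their version or permissions.

```shell
#!/bin/sh
# check_cudnn: hypothetical helper. Given a CUDA root directory,
# succeeds only if include/cudnn.h and at least one lib64/libcudnn*
# file exist there.
check_cudnn() {
    root=$1
    [ -f "$root/include/cudnn.h" ] || return 1
    for f in "$root"/lib64/libcudnn*; do
        # If the glob matched nothing, $f is the literal pattern and -e fails.
        [ -e "$f" ] && return 0
    done
    return 1
}

if check_cudnn /usr/local/cuda; then
    echo "cuDNN files found under /usr/local/cuda"
else
    echo "cuDNN files missing under /usr/local/cuda"
fi
```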
Adding the CUDA paths:
Create .bash_profile,
sudo gedit ~/.bash_profile
Add the following to the file
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda-8.0/lib64:/usr/local/cuda-8.0/extras/CUPTI/lib64"
export CUDA_HOME=/usr/local/cuda-8.0
Close the file, then reload the environment variables
source ~/.bash_profile
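One caveat with appending to LD_LIBRARY_PATH this way: if the variable starts out empty the result begins with ":", and every repeated source adds the same directories again. A possible variant that avoids both is sketched below; append_path is a hypothetical helper, not something the CUDA documentation provides.

```shell
#!/bin/sh
# append_path: hypothetical helper. Prints LIST with DIR appended,
# skipping DIR if it is already present and avoiding a leading ":"
# when LIST is empty (the dynamic loader treats an empty entry as the
# current directory).
append_path() {
    list=$1
    dir=$2
    case ":$list:" in
        *":$dir:"*) printf '%s\n' "$list" ;;
        "::")       printf '%s\n' "$dir" ;;
        *)          printf '%s\n' "$list:$dir" ;;
    esac
}

LD_LIBRARY_PATH=$(append_path "${LD_LIBRARY_PATH:-}" /usr/local/cuda-8.0/lib64)
LD_LIBRARY_PATH=$(append_path "$LD_LIBRARY_PATH" /usr/local/cuda-8.0/extras/CUPTI/lib64)
export LD_LIBRARY_PATH
export CUDA_HOME=/usr/local/cuda-8.0
```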
or Package Manager Installation
hd@hd:~$ sudo gedit ~/.profile
export PATH=/usr/local/cuda-8.0/bin${PATH:+:${PATH}}
hd@hd:~$ source ~/.profile
hd@hd:~$ cat /proc/driver/nvidia/version
NVRM version: NVIDIA UNIX x86_64 Kernel Module 367.57 Mon Oct 3 20:37:01 PDT
GCC version: gcc version 4.8.4 (Ubuntu 4.8.4-2ubuntu1~14.04.3)
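If a script needs just the driver version number from that file, it can be pulled out with awk. This is a sketch that assumes the NVRM line has the format shown above, with the version directly after the words "Kernel Module"; nvrm_version is a made-up name.

```shell
#!/bin/sh
# nvrm_version: hypothetical helper. Reads the text of
# /proc/driver/nvidia/version on stdin and prints the field that
# follows "Module" on the "NVRM version:" line.
nvrm_version() {
    awk '/^NVRM version:/ {
        for (i = 1; i < NF; i++)
            if ($i == "Module") { print $(i + 1); exit }
    }'
}

if [ -r /proc/driver/nvidia/version ]; then
    nvrm_version < /proc/driver/nvidia/version
else
    echo "no /proc/driver/nvidia/version (NVIDIA kernel module not loaded?)"
fi
```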
cd /usr/local/cuda/samples/5_Simulations/smokeParticles
sudo make
./smokeParticles
Installing the GPU version of Tensorflow
Pip installation
# Ubuntu/Linux 64-bit
$ sudo apt-get install python-pip python-dev
Select the GPU version of Tensorflow and install it
# Ubuntu/Linux 64-bit, GPU enabled, Python 2.7
# Requires CUDA toolkit 8.0 and CuDNN v5. For other versions, see "Installing from sources" below.
$ export TF_BINARY_URL=/tensorflow/linux/gpu/tensorflow_gpu-0.12.1-cp27-none-linux_x86_64.whl
# Python 2
$ sudo pip install --upgrade $TF_BINARY_URL
Note: pip itself may fail during the installation; if so, upgrade pip as follows
$ python -m pip install --upgrade pip
Test Tensorflow on the GPU:
# Using 'python -m' to find the program in the python search path:
$ python -m tensorflow.models.image.mnist.convolutional
Extracting data/train-images-idx3-ubyte.gz
Extracting data/train-labels-idx1-ubyte.gz
Extracting data/t10k-images-idx3-ubyte.gz
Extracting data/t10k-labels-idx1-ubyte.gz
...etc...
error:
Initialized!
F tensorflow/stream_executor/cuda/:221] Check failed: s.ok() could not find cudnnCreate in cudnn DSO; dlerror: /usr/local/lib/python2.7/dist-packages/tensorflow/python/_pywrap_tensorflow.so: undefined symbol: cudnnCreate
Fix: add the environment variables
export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda/lib64"
export CUDA_HOME=/usr/local/cuda
The CUDA guide also has a section 4.4, Device Node Verification; I tried it and it made no difference
#!/bin/bash
/sbin/modprobe nvidia
if [ "$?" -eq 0 ]; then
  # Count the number of NVIDIA controllers found.
  NVDEVS=`lspci | grep -i NVIDIA`
  N3D=`echo "$NVDEVS" | grep "3D controller" | wc -l`
  NVGA=`echo "$NVDEVS" | grep "VGA compatible controller" | wc -l`
  N=`expr $N3D + $NVGA - 1`
  for i in `seq 0 $N`; do
    mknod -m 666 /dev/nvidia$i c 195 $i
  done
  mknod -m 666 /dev/nvidiactl c 195 255
else
  exit 1
fi
/sbin/modprobe nvidia-uvm
if [ "$?" -eq 0 ]; then
  # Find out the major device number used by the nvidia-uvm driver
  D=`grep nvidia-uvm /proc/devices | awk '{print $1}'`
  mknod -m 666 /dev/nvidia-uvm c $D 0
else
  exit 1
fi
Received a notification in the final install message about these missing library files:
http://kmdouglass.github.io/stories/notes/cuda.html
Missing recommended library: libX11.so
Missing recommended library: libXi.so
Missing recommended library: libXmu.so
Missing recommended library: libGL.so
One suggestion is to create symlinks
To fix this, create symlinks in /usr/lib/
to the corresponding files:
sudo ln -s x86_64-linux-gnu/libX11.so libX11.so
sudo ln -s x86_64-linux-gnu/libXi.so libXi.so
sudo ln -s x86_64-linux-gnu/libXmu.so libXmu.so
sudo ln -s x86_64-linux-gnu/libGL.so libGL.so
Another suggestion is to install
sudo apt-get install freeglut3-dev build-essential libx11-dev libxmu-dev libxi-dev libgl1-mesa-glx libglu1-mesa libglu1-mesa-dev
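I tried both and cannot tell whether either had any effect
Ubuntu 14.04: CUDA installation problems and solutions
/gaowengang/p/6068788.html
Usually the newest proprietary graphics driver can be installed. You can also run the following command in a terminal to see which proprietary driver is recommended.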
sudo ubuntu-drivers devices
/linux/28743.html
Repairing unity: after removing all the packages reported as incorrectly installed, repair unity with the command: sudo apt-get install unity --fix-missing
Handling dual graphics cards:
/gaowengang/p/6068788.html
/desktop-linux/switch-intel-nvidia-graphics-card-ubuntu
http://slaytanic./2057708/1630597/
I consulted many online write-ups on dual graphics cards, and none of them worked on my machine:
Two ways to set up a CUDA 8 environment on Ubuntu 16.04: //02/26/ubuntu-cuda8-env-set/
BIOS setting to disable the integrated graphics: /Win10xy/Win10yh_522.html
.cn/viewtopic.php?t=476731
Intel Integrated Graphics, dedicated GPU for CUDA and Ubuntu 13.10 and 14.04: http://osdf.github.io/blog/intel-integrated-graphics-dedicated-gpu-for-cuda-and-ubuntu-1310.html
/html/2150.html
Dual graphics card fix: install bumblebee
sudo add-apt-repository ppa:bumblebee/stable
sudo apt-get update
sudo apt-get install bumblebee bumblebee-nvidia
After trying this I could reach the desktop, but the NVIDIA driver and CUDA were unusable, so this method does not solve the problem
My machine's configuration:
Graphic: Gallium 0.4 on llvmpipe (LLVM 3.8, 128 bits)
The Additional Drivers tab of Software & Updates
You can update your system with unsupported packages from this untrusted PPA by adding ppa:xorg-edgers/ppa to your system's Software Sources.
sudo add-apt-repository ppa:xorg-edgers/ppa
sudo apt-get update
Errors: /xia-Autumn/p/6228911.html
1.libcudart.so.8.0: cannot open shared object file: No such file or directory
======================================================================================
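[If this error appears on every start, set the variables in your shell startup file]
Open a terminal and run:
sudo gedit ~/.bashrc
Enter your user password. The password is not shown as you type.
The previous step opens the .bashrc file; append the following at the end:
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64"
export CUDA_HOME=/usr/local/cuda
To apply it immediately, run in the terminal: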
source ~/.bashrc
Alternatively, simply restart the computer.
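To see whether the variable really points at a directory that holds libcudart.so.8.0, the list can be walked entry by entry. A minimal sketch with a hypothetical helper named find_lib; note that the dynamic loader also consults the ldconfig cache and its default directories, so a miss here does not by itself prove the library cannot be loaded.

```shell
#!/bin/sh
# find_lib: hypothetical helper. Walks a colon-separated directory
# list (such as $LD_LIBRARY_PATH) and prints the first directory that
# contains the named file. Returns 1 if none does.
find_lib() {
    name=$1
    list=$2
    old_ifs=$IFS
    IFS=:
    for dir in $list; do
        if [ -n "$dir" ] && [ -e "$dir/$name" ]; then
            IFS=$old_ifs
            printf '%s\n' "$dir"
            return 0
        fi
    done
    IFS=$old_ifs
    return 1
}

if dir=$(find_lib libcudart.so.8.0 "${LD_LIBRARY_PATH:-}"); then
    echo "libcudart.so.8.0 found in $dir"
else
    echo "libcudart.so.8.0 is not in any LD_LIBRARY_PATH directory"
fi
```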
Make the driver run file executable: /u012759136/article/details/53355781
sudo chmod a+x NVIDIA-Linux-x86_64-375.20.run
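Install (note the options)
--no-x-check: skip the check for a running X server during installation
--no-nouveau-check: skip the check for the nouveau driver during installation
--no-opengl-files: install only the driver files, not the OpenGL files
sudo ./NVIDIA-Linux-x86_64-375.20.run --no-x-check --no-nouveau-check --no-opengl-files
After a reboot, the login loop problem does not occur
So the necessary libs have to be installed manually, as follows,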
$ sudo apt-get install freeglut3-dev
$ sudo apt-get install libxmu-dev
An error appears when running NVIDIA-X-server-settings:
hd@hd:~$ vim /etc/default/grub
hd@hd:~$ sudo update-grub
Add nomodeset nogpumanager in the grub file:
# If you change this file, run 'update-grub' afterwards to update
# /boot/grub/grub.cfg.
# For full documentation of the options in this file, see:
# info -f grub -n 'Simple configuration'
GRUB_DEFAULT=0
GRUB_HIDDEN_TIMEOUT=0
GRUB_HIDDEN_TIMEOUT_QUIET=true
GRUB_TIMEOUT=10
GRUB_DISTRIBUTOR=`lsb_release -i -s 2> /dev/null || echo Debian`
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash nomodeset nogpumanager"
GRUB_CMDLINE_LINUX=""
# Uncomment to enable BadRAM filtering, modify to suit your needs
# This works with Linux (no patch required) and with any kernel that obtains
# the memory map information from GRUB (GNU Mach, kernel of FreeBSD ...)
#GRUB_BADRAM="0x01234567,0xfefefefe,0x89abcdef,0xefefefef"
# Uncomment to disable graphical terminal (grub-pc only)
#GRUB_TERMINAL=console
# The resolution used on graphical terminal
# note that you can use only modes which your graphic card supports via VBE
# you can see them in real GRUB with the command `vbeinfo'
#GRUB_GFXMODE=640x480
# Uncomment if you don't want GRUB to pass "root=UUID=xxx" parameter to Linux
#GRUB_DISABLE_LINUX_UUID=true
# Uncomment to disable generation of recovery mode menu entries
#GRUB_DISABLE_RECOVERY="true"
# Uncomment to get a beep at grub start
#GRUB_INIT_TUNE="480 440 1"
sudo cp cuda_8.0.44_linux.run ~/cuda_8.0.44_linux.run
sudo apt-get update
sudo apt-get install vim
lspci | grep -i nvidia
sudo apt-get install linux-headers-$(uname -r)
sudo apt-get install python-pip python-dev
systemctl set-default multi-user.target
reboot
sudo vim /etc/X11/xorg.conf
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64"
sudo apt install git
git clone /tensorflow/tensorflow.git
python -c "import tensorflow"
hd@hd:~$ python -c "import tensorflow"
I tensorflow/stream_executor/:135] successfully opened CUDA library libcublas.so.8.0 locally
I tensorflow/stream_executor/:135] successfully opened CUDA library libcudnn.so.5 locally
I tensorflow/stream_executor/:135] successfully opened CUDA library libcufft.so.8.0 locally
I tensorflow/stream_executor/:135] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/:135] successfully opened CUDA library libcurand.so.8.0 locally
hd@hd:~$
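When checking a fresh install, it can be quicker to list only the library names from that log and compare them against the five expected ones. A sketch under the assumption that the log lines have exactly the wording shown above; opened_cuda_libs is a made-up name, and the sample input reuses two lines from the output above.

```shell
#!/bin/sh
# opened_cuda_libs: hypothetical helper. Reads TensorFlow start-up log
# text on stdin and prints the name of each library reported as
# "successfully opened CUDA library <name> locally", one per line.
opened_cuda_libs() {
    sed -n 's/.*successfully opened CUDA library \([^ ]*\) locally.*/\1/p'
}

# Example on saved log text (two lines copied from the output above).
opened_cuda_libs <<'EOF'
I tensorflow/stream_executor/:135] successfully opened CUDA library libcublas.so.8.0 locally
I tensorflow/stream_executor/:135] successfully opened CUDA library libcudnn.so.5 locally
EOF
```

To feed it the live log, the import command's stderr would most likely have to be redirected into the pipe (2>&1), since these messages do not appear to go to stdout.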
sudo service lightdm stop
sudo chmod a+x cuda_8.0.44_linux.run
sudo ./cuda_8.0.44_linux.run --no-opengl-libs
sudo nvidia-xconfig
sudo vim /etc/X11/xorg.conf
# nvidia-xconfig: X configuration file generated by nvidia-xconfig
# nvidia-xconfig: version 375.39 (buildmeister@swio-display-x86-rhel47-09) Tue Jan 31 20:47:44 PST
Section "ServerLayout"
    Identifier "Default Layout"
    Screen "Default Screen" 0 0
    InputDevice "Keyboard0" "CoreKeyboard"
    InputDevice "Mouse0" "CorePointer"
EndSection
Section "InputDevice"
    # generated from default
    Identifier "Keyboard0"
    Driver "keyboard"
EndSection
Section "InputDevice"
    # generated from default
    Identifier "Mouse0"
    Driver "mouse"
    Option "Protocol" "auto"
    Option "Device" "/dev/psaux"
    Option "Emulate3Buttons" "no"
    Option "ZAxisMapping" "4 5"
EndSection
Section "Monitor"
    Identifier "Monitor0"
    VendorName "Unknown"
    ModelName "Unknown"
    HorizSync 28.0 - 33.0
    VertRefresh 43.0 - 72.0
    Option "DPMS"
EndSection
Section "Device"
    Identifier "intel"
    Driver "intel"
    BusID "PCI:0@0:2:0"
    Option "AccelMethod" "SNA"
EndSection
Section "Screen"
    Identifier "Default Screen"
    Device "intel"
    Monitor "Monitor0"
    DefaultDepth 24
    Option "AccelMethod" "SNA"
    SubSection "Display"
        Depth 24
        Modes "nvidia-auto-select"
    EndSubSection
EndSection
/gaowengang/p/6068788.html
4. This machine uses the Intel integrated graphics as the display card and the NVIDIA discrete card only as the CUDA computing card. Create or edit /etc/X11/xorg.conf with the following content,
Section "Device"
Identifier "intel"
Driver "intel"
BusID "PCI:0@0:2:0" (the address can be looked up with lspci | grep -i intel)
Option "AccelMethod" "SNA"
EndSection
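One detail that is easy to trip over when filling in BusID: lspci prints the slot in hexadecimal (for example 00:02.0), while xorg.conf expects decimal numbers. The conversion can be sketched as below. The helper name lspci_to_busid is made up, the sketch assumes PCI domain 0 and a slot without a domain prefix, and it emits the plain PCI:bus:device:function form rather than the PCI:0@0:2:0 spelling used above.

```shell
#!/bin/sh
# lspci_to_busid: hypothetical helper. Converts an lspci slot such as
# "01:00.0" (hexadecimal bus:device.function) to the decimal
# "PCI:bus:device:function" form used for BusID in xorg.conf.
lspci_to_busid() {
    slot=$1
    bus=${slot%%:*}     # text before the first ":"
    rest=${slot#*:}     # "device.function"
    dev=${rest%%.*}
    fn=${rest#*.}
    # printf %d accepts 0x-prefixed arguments and prints them in decimal.
    printf 'PCI:%d:%d:%d\n' "0x$bus" "0x$dev" "0x$fn"
}

lspci_to_busid 00:02.0
lspci_to_busid 0a:00.0
```

For slots below 0a the hexadecimal and decimal values coincide, which is why the mismatch often goes unnoticed on small machines.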
To stop the system from automatically modifying this file, open /etc/default/grub and add the option "nogpumanager" to GRUB_CMDLINE_LINUX_DEFAULT, then update grub,
$ sudo update-grub
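Note: because --no-opengl-libs was specified during installation, the installer finishes with the following warnings,
Missing recommended library: libGLU.so
Missing recommended library: libXi.so
Missing recommended library: libXmu.so
So the necessary libs have to be installed manually, as follows,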
$ sudo apt-get install freeglut3-dev
$ sudo apt-get install libxmu-dev
Tensorflow(Old version):
/tensorflow/models
Tutorial:
/versions/master/get_started/
You do not appear to be using the NVIDIA X driver. Please edit your X configuration file (just run `nvidia-xconfig` as root), and restart the X server.
Modes "1024x768" sets the resolution. Add it yourself, then reboot. Part of the file is shown below
Section "Monitor"
    Identifier "Monitor0"
    VendorName "Unknown"
    ModelName "Unknown"
    HorizSync 31.5 - 61.0    # changed here
    VertRefresh 50.0 - 75.0    # changed here
    Option "DPMS"
EndSection
Section "Device"
    Identifier "Device0"
    Driver "nvidia"
    VendorName "NVIDIA Corporation"
EndSection
Section "Screen"
    Identifier "Screen0"
    Device "Device0"
    Monitor "Monitor0"
    DefaultDepth 24
    SubSection "Display"
        Depth 24
        Modes "1024x768"    # this line was not there originally; added by hand
    EndSubSection
EndSection
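3. Intel and NVIDIA dual graphics cards
Main reference: /config/ubuntu-nvidia-prime.html
Note that neither the main server nor the China server of my update sources listed NVIDIA's proprietary driver under Additional Drivers.
After consulting other material, I found on the NVIDIA website (/drivers) that the latest 64-bit Linux driver was version 340
But running sudo apt-get install nvidia-340 nvidia-settings-340 nvidia-prime
reported that no nvidia 340 packages exist. I switched to 331, which reported that there is no nvidia-settings-331 package; since that is only a settings tool, I skipped it for now.
Running sudo apt-get install nvidia-331 nvidia-prime succeeded
Once that is installed, also install the Nvidia Prime dual graphics card switching indicator, which lets you switch cards with a click in the system tray instead of using commands. Install it from the following PPA in a terminal: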
sudo add-apt-repository ppa:nilarimogard/webupd8
sudo apt-get update
sudo apt-get install prime-indicator
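After a reboot the graphics card switching icon appears.
Next, install a command-line tool to measure fps
This requires mesa-utils: sudo apt-get install mesa-utils
Test command: glxgears
NVIDIA / Intel integrated graphics for display + NVIDIA for compute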
/platero/p/4746285.html
sudo vim ~/.bashrc
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64"
export CUDA_HOME=/usr/local/cuda
export PATH=/usr/local/cuda-8.0/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
# prepend the driver directory so the CUDA entries set above are kept
export LD_LIBRARY_PATH=/usr/lib/nvidia-375${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
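Ubuntu 14.04 (64-bit) + GTX970 + CUDA8.0 + Tensorflow configuration (dual graphics cards: NVIDIA + Intel integrated graphics) ------ these notes were accumulated over a long period; I will tidy them up in detail when time permits...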