Commit 258229f

Update patches and README
1 parent 4fe6688 commit 258229f

2 files changed: +109 -5 lines changed

README.md

Lines changed: 47 additions & 2 deletions
@@ -1,5 +1,50 @@
11
# installTensorFlowTX2
2-
Install TensorFlow on the NVIDIA Jetson TX2 Development Kit
2+
April 1, 2017
3+
Install TensorFlow v1.0.1 on NVIDIA Jetson TX2 Development Kit
34

4-
Work in progress
5+
Jetson TX2 is flashed with JetPack 3.0 which installs:
6+
* L4T 27.1, an Ubuntu 16.04 64-bit variant (aarch64)
7+
* CUDA 8.0
8+
* cuDNN 5.1.10
9+
10+
### Installation
11+
Before installing TensorFlow, a swap file should be created (minimum of 8GB recommended). The Jetson TX2 does not have enough physical memory to compile TensorFlow. The swap file may be located on the internal eMMC, and may be removed after the build.
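
The swap file step described above might look like the following sketch. The path `/swapfile` and the helper names `swap_commands` and `swap_cleanup_commands` are assumptions made for illustration and are not taken from this repository; the 8 GB size follows the recommendation above. The script prints the commands instead of running them, so they can be reviewed first.

```shell
#!/bin/sh
# Sketch only: SWAP_FILE, SWAP_SIZE and the two helpers are illustrative
# names, not part of the installTensorFlowTX2 repository.
SWAP_FILE="${SWAP_FILE:-/swapfile}"   # assumed location on the internal eMMC
SWAP_SIZE="${SWAP_SIZE:-8G}"          # 8 GB minimum recommended above

# Print the commands that create and enable the swap file.
swap_commands() {
    echo "sudo fallocate -l $SWAP_SIZE $SWAP_FILE"
    echo "sudo chmod 600 $SWAP_FILE"
    echo "sudo mkswap $SWAP_FILE"
    echo "sudo swapon $SWAP_FILE"
}

# Print the commands that disable and delete the swap file after the build.
swap_cleanup_commands() {
    echo "sudo swapoff $SWAP_FILE"
    echo "sudo rm $SWAP_FILE"
}

swap_commands
```

Piping the output to `sh` (for example `swap_commands | sh`) would execute the steps, and `free -h` can then confirm that the swap space is active.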
12+
13+
Note: This procedure was derived from these discussion threads:
14+
15+
<ul>
16+
<li>https://github.com/tensorflow/tensorflow/issues/851</li>
17+
<li>http://stackoverflow.com/questions/39783919/tensorflow-on-nvidia-tx1/</li>
18+
<li>https://devtalk.nvidia.com/default/topic/1000717/tensorflow-on-jetson-tx2/</li>
19+
</ul>
20+
21+
TensorFlow should be built in the following order:
22+
23+
#### installPrerequisites.sh
24+
Installs Java and other dependencies needed. Also builds:
25+
26+
##### Bazel
27+
Builds version 0.4.5. Includes patches for compiling under aarch64.
28+
29+
#### cloneTensorFlow.sh
30+
Git clones v1.0.1 from the TensorFlow repository and patches the source code for aarch64.
31+
32+
#### setTensorFlowEV.sh
33+
Sets up the TensorFlow environment variables. This script will ask for the default Python library path. There are many settings to choose from; the script picks the usual suspects. Uses Python 2.7.
34+
35+
#### buildTensorFlow.sh
36+
Builds TensorFlow.
37+
38+
#### packageTensorFlow.sh
39+
Once TensorFlow has finished building, this script may be used to create a 'wheel' file, a package for installing with Python. The wheel file will be in the $HOME directory.
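
The build order listed above can be sketched as a small wrapper that stops at the first failing step. This assumes all five scripts sit in the current directory, are executable, and are meant to be executed directly rather than sourced; `BUILD_STEPS` and `run_steps` are hypothetical names introduced here, not part of the repository.

```shell
#!/bin/sh
# Sketch only: BUILD_STEPS and run_steps are illustrative names.
# The script names themselves come from the README sections above.
BUILD_STEPS="installPrerequisites.sh cloneTensorFlow.sh setTensorFlowEV.sh buildTensorFlow.sh packageTensorFlow.sh"

# Run each script in the documented order; stop at the first problem.
run_steps() {
    for step in $BUILD_STEPS; do
        if [ ! -x "./$step" ]; then
            echo "missing or not executable: $step" >&2
            return 1
        fi
        echo "running $step"
        "./$step" || { echo "failed: $step" >&2; return 1; }
    done
}
```

Calling `run_steps` from the repository directory would run the whole sequence. Because setTensorFlowEV.sh asks questions interactively, the wrapper should be run from a terminal rather than from an unattended job.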
40+
41+
#### Install wheel file
42+
$ pip install $HOME/<em>wheel file</em>
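
As a hedged illustration of the install step, the sketch below looks for the most recently written wheel in `$HOME` and prints the matching `pip install` command. The `tensorflow-*.whl` file name pattern and the helper name `latest_wheel` are assumptions; the actual file name produced by packageTensorFlow.sh may differ.

```shell
#!/bin/sh
# Sketch only: latest_wheel is an illustrative helper, and the
# tensorflow-*.whl pattern is an assumed file name, not confirmed by the README.
latest_wheel() {
    # List matching wheels newest first and keep only the first one.
    ls -t "$1"/tensorflow-*.whl 2>/dev/null | head -n 1
}

wheel="$(latest_wheel "$HOME")"
if [ -n "$wheel" ]; then
    # Print the command so it can be checked before running it.
    echo "pip install $wheel"
else
    echo "no TensorFlow wheel found in $HOME" >&2
fi
```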
43+
44+
45+
#### Build Issues
46+
47+
The build may fail for various reasons. The 'debug' folder contains a more verbose version of the buildTensorFlow.sh script that reports both what it is doing and any errors it encounters. See the debug directory for more details.
48+
49+
#### Notes
550

patches/tensorflow.patch

Lines changed: 62 additions & 3 deletions
@@ -1,7 +1,66 @@
1-
diff --git tensorflow/stream_executor/cuda/cuda_gpu_executor.cc tensorflow/stream_executor/cuda/cuda_gpu_executor.cc
1+
diff --git a/tensorflow/core/kernels/BUILD b/tensorflow/core/kernels/BUILD
2+
index 2e04827..9d81923 100644
3+
--- a/tensorflow/core/kernels/BUILD
4+
+++ b/tensorflow/core/kernels/BUILD
5+
@@ -1184,7 +1184,7 @@ tf_kernel_libraries(
6+
"segment_reduction_ops",
7+
"scan_ops",
8+
"sequence_ops",
9+
- "sparse_matmul_op",
10+
+ #DC "sparse_matmul_op",
11+
],
12+
deps = [
13+
":bounds_check",
14+
diff --git a/tensorflow/core/kernels/cwise_op_gpu_select.cu.cc b/tensorflow/core/kernels/cwise_op_gpu_select.cu.cc
15+
index 02058a8..880a0c3 100644
16+
--- a/tensorflow/core/kernels/cwise_op_gpu_select.cu.cc
17+
+++ b/tensorflow/core/kernels/cwise_op_gpu_select.cu.cc
18+
@@ -43,8 +43,14 @@ struct BatchSelectFunctor<GPUDevice, T> {
19+
const int all_but_batch = then_flat_outer_dims.dimension(1);
20+
21+
#if !defined(EIGEN_HAS_INDEX_LIST)
22+
- Eigen::array<int, 2> broadcast_dims{{ 1, all_but_batch }};
23+
- Eigen::Tensor<int, 2>::Dimensions reshape_dims{{ batch, 1 }};
24+
+ // Eigen::array<int, 2> broadcast_dims{{ 1, all_but_batch }};
25+
+ Eigen::array<int, 2> broadcast_dims;
26+
+ broadcast_dims[0] = 1;
27+
+ broadcast_dims[1] = all_but_batch;
28+
+ // Eigen::Tensor<int, 2>::Dimensions reshape_dims{{ batch, 1 }};
29+
+ Eigen::Tensor<int, 2>::Dimensions reshape_dims;
30+
+ reshape_dims[0] = batch;
31+
+ reshape_dims[1] = 1;
32+
#else
33+
Eigen::IndexList<Eigen::type2index<1>, int> broadcast_dims;
34+
broadcast_dims.set(1, all_but_batch);
35+
diff --git a/tensorflow/core/kernels/sparse_tensor_dense_matmul_op_gpu.cu.cc b/tensorflow/core/kernels/sparse_tensor_dense_matmul_op_gpu.cu.cc
36+
index a177696..28d2f59 100644
37+
--- a/tensorflow/core/kernels/sparse_tensor_dense_matmul_op_gpu.cu.cc
38+
+++ b/tensorflow/core/kernels/sparse_tensor_dense_matmul_op_gpu.cu.cc
39+
@@ -104,9 +104,17 @@ struct SparseTensorDenseMatMulFunctor<GPUDevice, T, ADJ_A, ADJ_B> {
40+
int n = (ADJ_B) ? b.dimension(0) : b.dimension(1);
41+
42+
#if !defined(EIGEN_HAS_INDEX_LIST)
43+
- Eigen::Tensor<int, 2>::Dimensions matrix_1_by_nnz{{ 1, nnz }};
44+
- Eigen::array<int, 2> n_by_1{{ n, 1 }};
45+
- Eigen::array<int, 1> reduce_on_rows{{ 0 }};
46+
+ // Eigen::Tensor<int, 2>::Dimensions matrix_1_by_nnz{{ 1, nnz }};
47+
+ Eigen::Tensor<int, 2>::Dimensions matrix_1_by_nnz;
48+
+ matrix_1_by_nnz[0] = 1;
49+
+ matrix_1_by_nnz[1] = nnz;
50+
+ // Eigen::array<int, 2> n_by_1{{ n, 1 }};
51+
+ Eigen::array<int, 2> n_by_1;
52+
+ n_by_1[0] = n;
53+
+ n_by_1[1] = 1;
54+
+ // Eigen::array<int, 1> reduce_on_rows{{ 0 }};
55+
+ Eigen::array<int, 1> reduce_on_rows;
56+
+ reduce_on_rows[0]= 0;
57+
#else
58+
Eigen::IndexList<Eigen::type2index<1>, int> matrix_1_by_nnz;
59+
matrix_1_by_nnz.set(1, nnz);
60+
diff --git a/tensorflow/stream_executor/cuda/cuda_gpu_executor.cc b/tensorflow/stream_executor/cuda/cuda_gpu_executor.cc
261
index b2da109..8ee1f3a 100644
3-
--- tensorflow/stream_executor/cuda/cuda_gpu_executor.cc
4-
+++ tensorflow/stream_executor/cuda/cuda_gpu_executor.cc
62+
--- a/tensorflow/stream_executor/cuda/cuda_gpu_executor.cc
63+
+++ b/tensorflow/stream_executor/cuda/cuda_gpu_executor.cc
564
@@ -870,7 +870,10 @@ CudaContext* CUDAExecutor::cuda_context() { return context_; }
665
// For anything more complicated/prod-focused than this, you'll likely want to
766
// turn to gsys' topology modeling.
