@@ -17,47 +17,47 @@ Project: Efficient Python routines for analysis on massively multi-threaded plat
1717Submitted by- Deepanshu Thakur
1818******************************
1919
20- I spend my last 3 months working on `GSoC project `_. My GSoC project was
21- related with writing the bindings of the Hydra C++ library. Hydra is a header
22- only C++ library designed and used to run on Linux platforms. Hydra is a
20+ I spent my last 3 months working on a `GSoC project`_. My GSoC project was
21+ related to writing the bindings of the Hydra C++ library. Hydra is a header-only
22+ C++ library designed and used to run on Linux platforms. Hydra is a
2323templated C++11 library designed to perform common High Energy Physics data
24- analyses on massively parallel platforms. The idea of this GSoC project is to
25- provide the bindings of the Hydra library, so that the python support for
26- Hydra library can be added and python can be used for the prototyping or
24+ analysis on massively parallel platforms. The idea of this GSoC project was to
25+ provide Python bindings for the Hydra library, so that Python support
26+ can be added to the overall Hydra project and Python can be used for prototyping or
2727development.
2828
2929
3030.. _GSoC project : https://summerofcode.withgoogle.com/projects/#6669304945704960
3131
32- My original proposal deliverables and my final output looks a little bit
33- different and there are some very good reasons for it. The change of
32+ My original proposal deliverables and final output ended up looking a little bit
33+ different, and there are some very good reasons for it. The change of
3434deliverables will become evident in the discussion of the design challenges
3535and choices later in the report. In the beginning the goal was to write the
3636bindings for the ``Data Fitting``, ``Random Number Generation``,
3737``Phase-Space Monte Carlo Simulation``, ``Functor Arithmetic`` and
3838``Numerical integration``, but we ended up having the bindings for
3939``Random Number Generation`` and ``Phase-Space Monte Carlo Simulation`` only.
40- (Though remaining classes can be binded with some extra efforts but we do
40+ (The remaining classes can be bound with some extra effort, but we do
4141not have time left under the current scope of GSoC, so I have decided to
42- continue with the project outside the scope of GSoC.)
42+ continue with the project outside the scope of GSoC, given my interest in the project.)
4343
4444
45- Choosing proper tools
46- *********************
45+ Choosing the proper tools
46+ *************************
4747
48- Let me take you to my 3 months journey. First step was to find a tool or
49- package to write the bindings. Several options were in principle available to
50- write the bindings for example in the beginning we tried to evaluate the
51- `SWIG `_.
48+ Let me take you through my three-month journey. The first step was to find a tool or
49+ package to write the bindings with. Several options were in principle available to
50+ write the bindings. For example, at the beginning we tried to evaluate the
51+ `SWIG`_ project.
5252But SWIG has two problems: it is very complicated to use, and it
5353does not support ``variadic templates``, while Hydra's underlying
5454`Thrust library`_ depends heavily on variadic templates. After trying our hand
5555at SWIG and realizing it could not fulfill our requirements, we turned our
56- attention to `Boost.Python `_ which looks quite promising and a very large
57- project but this large and complex suite project have so many tweaks and
58- hacks so that it can work on almost any compiler but with added so many
59- complexities and cost. Finally we turned our attention to use `pybind11 `_.
60- A quote taken from pybind11 documentation,
56+ attention to `Boost.Python`_, which looked quite promising. It is a very large
57+ project, but this large and complex suite has so many tweaks and
58+ hacks to make it work on almost any compiler that it adds much
59+ complexity and cost. Finally, we turned our attention to the newer `pybind11`_ project.
60+ A quote taken from the pybind11 documentation:
6161
6262 Boost is an enormously large and complex suite of utility libraries
6363 that works with almost every C++ compiler in existence. This compatibility
@@ -80,31 +80,30 @@ to go ahead with pybind11. Next step was to `familiarize myself`_ with pybind11.
8080The Basic design problem
8181************************
8282
83- Now we needed to solve the basic design problem which is the `CRTP idiom `_.
84- Hydra library relies on the CRTP idiom to avoid runtime overhead. I
83+ The basic design problem is the `CRTP idiom`_.
84+ The Hydra library relies on the CRTP idiom to avoid runtime overhead. I
8585investigated a lot about CRTP and it took a little while to finally come up
86- with a solution that can work with any number N. It means our class can accept
87- any number of particles at final states. (denoted by N) If you know about
88- CRTP, it is a type of static polymorphism or compile time polymorphism. The
89- idea that I implemented was to take a parameter from python and based on that
86+ with a solution that can work with any number of final-state particles (denoted N) often used in Hydra applications.
87+ CRTP is a form of static, or compile-time, polymorphism. The
88+ idea that I implemented was to take a parameter from Python and, based on that
9089parameter, write the bindings to a new file, compiling and generating
91- them on runtime with system calls. Unfortunately generating bindings at
90+ them at runtime with system calls. Unfortunately, generating bindings at
9291runtime and compiling them takes a lot of time, so it is not
93- feasible for user to each time wait for few minutes before actually be
94- able to use the generated package. We decided to go ahead with fixed number
95- of values. Means we generate bindings for a limited number of particles.
96- Currently python bindings for classes supports up to 10 (N = 10) number of
97- particles at final state. We can make that to work with any number we want,
92+ feasible for a user to wait a few minutes each time before actually being
93+ able to use the generated package from Python. We decided to go ahead with a fixed set
94+ of values of N, meaning that we generate the bindings for a limited number of particles.
95+ Currently the Python bindings for the Hydra classes support up to 10 particles (N = 10)
96+ in the final state. Note that we can make this work with any number we want,
9897as our binding code is written within a macro, so it is just a matter of
99- writing additional 1 extra call to make it use with extra value of N.
98+ writing additional, trivial-to-add calls to make the bindings work for extra values of N.
10099
101100.. _CRTP idiom : https://en.wikipedia.org/wiki/Curiously_recurring_template_pattern
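
To make the constraint above concrete, here is a minimal CRTP sketch. It is not taken from Hydra: ``DecayBase``, ``Decay`` and ``count_particles`` are hypothetical names chosen only for illustration. It shows why N has to be known when the C++ code is compiled, which is what makes a value of N supplied from Python at runtime awkward to support.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical CRTP base (not a Hydra class): the derived type is a
// template parameter, so the call below is resolved at compile time
// and no virtual function is needed.
template <typename Derived>
struct DecayBase {
    std::size_t n_particles() const {
        // Static dispatch: cast to the derived type and forward the call.
        return static_cast<const Derived*>(this)->n_particles_impl();
    }
};

// Hypothetical decay with N final-state particles. N is a template
// parameter, so every value of N is a distinct type that must exist
// when the code is compiled.
template <std::size_t N>
struct Decay : DecayBase<Decay<N>> {
    std::size_t n_particles_impl() const { return N; }
};

// Generic code written against the CRTP base works for any N.
template <typename D>
std::size_t count_particles(const DecayBase<D>& decay) {
    return decay.n_particles();
}
```

Calling ``count_particles(Decay<3>{})`` instantiates the template for N = 3 at compile time. There is no way to write ``Decay<n>`` for a runtime variable ``n``, so a binding layer has to either compile new code on demand or ship a fixed set of instantiations.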
102101
103102
104- The Hydra Binding
105- *****************
103+ The Hydra bindings
104+ ******************
106105
107- Now that the approach was decided, we jump into the bindings of Hydra.
106+ Now that the approach was decided, we jumped into the bindings of Hydra.
108107(Finally after so many complications but unfortunately this was not the
109108end of them.) We decided to bind the most important classes first,
110109``Random Number Generation`` and ``Phase-Space Monte Carlo Simulation``.
@@ -121,20 +120,20 @@ to generate the phase space monte carlo simulation.
121120 [F. James, Monte Carlo Phase Space, CERN 68-15 (1968)]
122121 (https://cds.cern.ch/record/275743).
123122
124- The Momentum and Energy units are GeV/C, GeV/C^2 . The PhaseSpace monte
125- carlo class depends on the ``Vector3R ``, ``Vector4R `` and ``Events `` classes.
123+ The momentum and energy units are GeV/c and GeV/c^2, respectively. The PhaseSpace Monte
124+ Carlo class depends on the ``Vector3R``, ``Vector4R`` and ``Events`` classes.
126125Thus the PhaseSpace class cannot be bound without all of the above classes.
127126
128127The ``Vector3R`` and ``Vector4R`` classes were bound. There were some problems
129- like generating ``__eq__ `` and ``__nq__ `` methods for python side but I solved
130- them by creating ``lambda function `` and iterating over values and checking
128+ like generating the ``__eq__`` and ``__ne__`` methods for the Python side, but I solved
129+ them by creating ``lambda functions`` that iterate over the values and check
131130whether they satisfy the conditions. The ``Vector4R`` or four-vector class
132- represents a particle. The idea is I first bind the particles class
131+ represents a particle. The idea was that I first bound the particle class
133132(the four-vector class), then I had to bind the ``Events`` class that will
134- hold the Phase Space generated by the ``PhaseSpace `` class, and then bind the
133+ hold the phase-space events generated by the ``PhaseSpace`` class, and then bind the
135134actual ``PhaseSpace`` class. The ``Events`` class was not so easy to bind
136135because it depended on ``hydra::multiarray``, and without its
137- bindings, the ``Events `` class was impossible to bind. Thanks to my mentor
136+ bindings, the ``Events`` class was impossible to bind. Thanks go to my mentors,
138137who had already written these bindings for the ``Random`` class with some tweaks to
139138pybind11’s bind_container itself. We even faced some design issues with the
140139Events class in Hydra itself. But eventually after solving these problems,
@@ -165,7 +164,7 @@ After completing the PhaseSpace code, I quickly converted the code into macro
165164for supporting up to 10 particles.
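
The macro approach can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual Hydra binding code: ``PhaseSpaceN``, ``Registry``, ``REGISTER_PHASESPACE`` and the generated names such as ``PhaseSpace4`` are all hypothetical, and a plain map stands in for the real binding module.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical stand-in for a class templated on the number of particles.
template <std::size_t N>
struct PhaseSpaceN {
    static std::size_t particles() { return N; }
};

// Stand-in for a binding module: exported name -> number of particles.
using Registry = std::map<std::string, std::size_t>;

// One macro call registers one instantiation. The stringized N is
// concatenated onto the prefix, giving names like "PhaseSpace4".
// In binding code this slot would hold the class definition instead.
#define REGISTER_PHASESPACE(registry, N) \
    (registry)["PhaseSpace" #N] = PhaseSpaceN<N>::particles()

inline Registry make_registry() {
    Registry registry;
    REGISTER_PHASESPACE(registry, 2);
    REGISTER_PHASESPACE(registry, 3);
    REGISTER_PHASESPACE(registry, 4);  // supporting another N is one more line
    return registry;
}
```

With this layout every supported N is compiled once, ahead of time, so the user pays no compilation cost at import. The price is that only the listed values of N are available.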
166165
167166Now the PhaseSpace class was working perfectly! Next step was to create a
168- series of test cases and documentation and of-course the example of
167+ series of test cases, documentation, and of course an example of the
169168PhaseSpace class in action. The remaining algorithms that I named at the
170169start of the article are left to implement.
171170
@@ -178,17 +177,17 @@ things not only related with programming but related with high energy physics.
178177I learned about *Monte Carlo Simulations*, and how they can be used to solve
179178challenging real-life problems. I read and studied a research paper
180179(https://cds.cern.ch/record/275743/files/CERN-68-15.pdf), learned about
181- particle decays, learned the insights of C++ varidiac templates,
180+ particle decays, learned the insights of C++ variadic templates,
182181wrote a blog about `CRTP`_, learned how to compile a
183- python function and why simple python functions cannot be used in
182+ Python function and why simple Python functions cannot be used in
184183multithreaded environments. Most importantly, I learned how to structure
185184a project from scratch and how important documentation and test cases are.
186185
187186
188187.. _CRTP : https://medium.com/@deepanshu2017/a-curiously-recurring-python-d3a441a58174
189188
190189
191- Special Thanks
190+ Special thanks
192191**************
193192
194193Shoutout to my amazing mentors. I would like to thank