MY SCIENTIFIC AND TECHNICAL PRODUCTION

I can summarize my HPC experience as a non linear trip from problem formulation to efficient implementation on various platforms, with different algorithmic and computation paradigms in between. Keep in mind that there is always an effort (typical and/or personal) to switch for a moment to another fundamental approach or context application. At the price of the consequent step-by-step intellectual and time overhead, due to the need to move from an intermediate level to a more mature state for each, the trip reveals rewarding for me so far, since all these skills are indeed additive. By the way, although being a general consensus, learning and investigating really appear incremental to me.

Back to Homepage

 

  Journal Papers (Display summaries)
  1. Claude Tadonki, and Bernard Philippe, Parallel multiplication of a vector by a Kronecker product of matrices, Parallel Distributed Computing Practices PDCP, volume 2(4), 1999.
    Abstract. We provide a parallel algorithm for the Kronecker product of matrices based on a cyclic partition of loops. Our general model unifies one scheme with redundant computations but no communication and an opposite scheme without redundant computation but with interprocessor communications. Both schema can be mixed with a certain balance influenced by the volume of floating point operations and the characteristics of the target architecture. This work has valuated experimental validations ({\small CENJU and INTEL PARAGON}).
  2. Claude Tadonki, and Bernard Philippe, Parallel multiplication of a vector by a Kronecker product of matrices (part II), Parallel Distributed Computing Practices PDCP, volume 3(3), 2000.
    Abstract. This paper presents a revised and generalized version of our previous parallel algorithm for the Kronecker product of matrices. We show how to map the computation on any number of processors that is a divisor of the product of the matrices sizes (instead of being a prefix of this product as previously). Moreover, we show that the minimum number of parallel communications with p processors is log(p) whatever the algorithm, and that our algorithm achieves this optimal performance. The work is validated by significant efficiencies obtained from experimental measures on the {\small CRAY} machine.
  3. Sanjay Rajopadhye, Tanguy Risset, et Claude Tadonki, Algebraic Path Problem on linear arrays, Techniques et Sciences Informatiques  TSI, 20 (5), 2001.
    Abstract. We seek a linear SPMD implementation of the Warshal algorithm for the Algebraic Path Problem (Unified model of the transitive closure, shortest paths, Gauss elimination, ...). Our parallel algorithm is systolic-like in its original version, and offers the important advantage of being able to run on a shorter number of processors by a natural round- robbing remapping. We derive and validate a blocked version for standard distributed memory parallel machines, as the cost of data communication would be severe otherwise. We show experimental validations on the {\small CRAY} machine.
  4. Claude Tadonki, A Recursive Method for Graph Scheduling, Scientific Annals of Cuza University, Vol 11, p.121-131, 2002
    Abstract. This paper presents a recursive graph scheduling method. Our paradigm applies on any acyclic graph that can be partitioned into isomorphic subgraphs. Indeed, many common problem domains are 3D graphs that can be organized as a chain of isomorphic 2D subgraphs. Starting from any valid schedule of the source subgraph of the chain, we use the generic isomorphism to systematically derive a valid global schedule, with a local pipeline between two consecutive subgraphs. Our technique is then characterized by the nature of the partition and the source schedule. We discuss the impact of these characteristics on the complexity on the generated parallel schedules.
  5. C. Beltran, C. Tadonki and J.-Ph. Vial, Solving the p-median problem with a semi-Lagrangian relaxation, Computational Optimization and Applications, Volume 35(2), October 2006. (pdf)
    Abstract. This paper deals with operation research and non differentiable optimization. The so-called P-median problem is the problem of locating P "facilities" relative to a set of "customers" such that the sum of the shortest demand weighted distance between "customers" and "facilities" is minimized. Indeed, this a classical combinatorial optimization problem with a huge set of potential solutions. Using a semi-Lagrangian relaxation, we tackle the problem in its associated continuous formulation and report our non-smooth convex optimization engineering results.
  6. F. Babonneau, C. Beltran, A. Haurie, C. Tadonki and J.-P. Vial, Proximal-ACCPM: a versatile oracle based optimization method, Computational and Management Science, Volume 9, 2007. (pdf)
    Abstract. Oracle Based Optimization (OBO) conveniently designates an approach to handle a class of convex optimization problems in which the information pertaining to the function to be minimized and/or to the feasible domain takes the form of a linear outer approximation revealed by an oracle. We show how difficult problems can be cast in this format, and then solved within our context. We present our method, so-called Proximal-ACCPM, to trigger the OBO approach and give a snapshot on numerical results. This paper summarizes several contributions with the OBO approach and aims to give, in a single report, enough information on the method and its implementation to facilitate new applications.
  7. C. Tadonki, Mathematical and Computational Engineering in X-Ray Crystallography, International Journal of Advanced Computer Engineering, volume 1(2) 2008.
    Abstract. The main purpose of X-Ray Crystallography is to predict a macromolecular structure using from X-Ray synchrotron radiation. Among existing paradigms to achieve this task, analytical approaches come up as good candidates for automation trough mathematical approximations and computational engineering. In addition to the later, statistical processing are required in order to refine the data according to the physical model and the boarding effects of the experiments. We revisit the basis of the problem and focus on a more precise effect, so-called radiation damage, for which it has been also proven that it can be artificially managed to become instructive.
  8. T. Saidani, L. Lacassagne, J. Falcou, C. Tadonki, Samir Bouaziz, Parallelization Schemes for Memory Optimization on the Cell Processor : A Case Study on the Harris Corner Detector, Transactions on High-Performance Embedded Architectures and Compilers, volume 3(3) 2008.
    Abstract. The Cell processor is a typical example of heterogeneous multiprocessor on-chip architecture that uses several levels of parallelism to deliver high performance. Although its efficiency potential, the execution mode and part of hardware specificities make it being non trivial to deal with. Indeed, reducing the gap between peak performance and effective performance is the challenge for compiler design and efficient implementations. Image processing and media applications are typical "main stream" applications one could consider while investigating on Cell benchmarks. Our investigations, trough various implementation of the Harris detection algorithm, reveal that the impact of DMA controlled data transfers and synchronizations between SPEs are key points for global performance.
  9. C. Tadonki, Integer Programming Heuristic for the Dual Power Setting Problem in Wireless Sensors Networks, Int. Journal of Advanced Research in Computer Engineering, vol 3(1) 2009.
    Abstract. We seek an integer programming based heuristic for solving the dual power management problem in wireless sensor networks. For a given network with two possible transmission powers (low and high), the problem is to find a minimum size subset of nodes such that if they are assigned high transmission power while the others are assigned low transmission power, the network will be strongly connected. The main purpose behind this efficient setting is to minimize the total communication power consumption while maintaining the network connectivity. In a theoretical point of view, the problem is known to be difficult to solve exactly. An approach to approximate the solution is to work with a spanning tree of clusters. Each cluster is a strongly connected component when consider low transmission power. We follow the same approach, and we formulate the node selection problem inside clusters as an integer programming problem which is solved exactly using specialized codes. Experimental results show that our algorithm is efficient both in execution time and solution quality.
  10. C. Tadonki, G. Grodidier, O. Pene, An efficient CELL library for lattice quantum chromodynamics, ACM SIGARCH Computer Architecture News, vol 38(4) 2011.
    Abstract. Quantum chromodynamics (QCD) is the theory of subnuclear physics, aiming at modeling the strong nuclear force, which is responsible for the interactions of nuclear particles. Numerical QCD studies are performed through a discrete formalism called LQCD (Lattice Quantum Chromodynamics). Typical simulations involve very large volume of data and numerically sensitive entities, thus the crucial need of high performance computing systems. We propose a set of CELL-accelerated routines for basic LQCD calculations. Our framework is provided as a unified library and is particularly optimized for an iterative use. Each routine is parallelized among the SPUs, and each SPU achieves it task by looping on small chunk of arrays from the main memory. Our SPU implementation is vectorized with double precision data, and the cooperation with the PPU shows a good overlap between data transfers and computations. Moreover, we permanently keep the SPU context and use mailboxes to synchronize between consecutive calls. We validate our library by using it to derive a CELL version of an existing LQCD package (tmLQCD). Experimental results on individual routines show a significant speedup compare to standard processor, 11 times better than a 2.83 GHz INTEL processor for instance (without SSE). This ratio is around 9 (with QS22 blade) when consider a more cooperative context like solving a linear system of equations (usually referred as Wislon-Dirac inversion). Our results clearly demonstrate that the CELL is a very promising way for high-scale LQCD simulations.
 Conférence/Symposium/Workshop Papers (grouped by topic)

1. Parallel Computing (algorithm, scheduling, complexity, implementation, dynamic system)
Overview. This subset of my outputs is related to general purpose parallel computing. It includes adhoc parallel algorithms, methodology for systematic parallel scheduling, efficient parallel implementation, and parallel dynamic systems. My actual focus in this topic is on the design and analysis of powerful methodologies specific to multicore processors and accelarators based architectures (CELL, GPU, ...), both for domain specific considerations and a wider audience. I keep investigating on fundamental aspects, since new hypothesis came up with emergent architectures and the increasing and pervasive HPC demand.
  1. Claude Tadonki,
    Système d'équations récurrentes et multiplication parallèle d'un vecteur par un produit tensoriel de matrices,
    Rencontres Francophones de Parallelisme Renpar'11, Rennes (France), 1999.
  2. Sanjay Rajopadhye, Tanguy Risset, et Claude Tadonki,
    The algebraic path problem revisited,
    European Conference on Parallel Computing  Europar99, Toulouse (France), Lncs Sringer-Verlag, N° 1685, p. 698-707, August 1999.
  3. Claude Tadonki,
    Ordonnancements canoniques,
    Renpar12
    , Rencontres Francophones de Parallelisme, Besançon (France), Juin 2000.
  4. Claude Tadonki,
    Parallel Cholesky Factorization,
    Parallel Matrix Algorithms and Appliations  PMAA Worshop, Neuchatel (Switzerland), August 2000.
  5. Claude Tadonki, et Bernard Philippe,
    Méthodologie de conception d'algorithmes efficaces pour le produit tensoriel,
    CARI2000, Tananarive (Madagascar), Octobre 2000.
  6. Patrice Quinton, Claude Tadonki, et Maurice Tchuente,
    Un échéancier systolique et son utilisation dans l'ATM,
    CARI2000, Tananarive (Madagascar), Octobre 2000.
  7. Claude Tadonki,
    Complexité des ordonnancements canoniques et dérivation d'architecture,
    Rencontres Francophones de Parallelisme Renpar13, Paris (France), Avril 2001 ( get it! ).
  8. Claude Tadonki,
    A Recursive Method for Graph Scheduling,
    International Symposium on Parallel and Distributed Computing (SPDC), Iasi, Romania, July 2002 ( get it! ).
  9. R. Ndoundam, C. Tadonki, and M. Tchuente,
    Parallel chip firing game associated with n-cube orientation,
    International Conference on Computational Science, ICCS04 (LNCS/Springer), Krakow, Poland, June 2004 .
  10. T. Saidani, J. Falcou, C. Tadonki, L. Lacassagne, and D. Etiemble,
    Algorithmic Skeletons within an Embedded Domain Specific Language for the CELL Processor,
    Parallel Architectures and Compilation Techniques (PACT), PACT09, Raleigh, North Carolina (USA), September 12-16, 2009. (pdf)
  11. C. Tadonki, G. Grosdidier, and O. Pene,
    An efficient CELL library for Lattice Quantum Chromodynamics,
    International Workshop on Highly Efficient Accelerators and Reconfigurable Technologies (HEART) in conjunction with the 24th ACM International Conference on Supercomputing (ICS), pp. 67-71, Epochal Tsukuba, Tsukuba, Japan, June 1-4, 2010. (ACM Computer Architecture News)
  12. C. Tadonki, L. Lacassagne T. Saidani, J. Falcou, K. Hamidouche,
    The Harris algorithm revisited on the CELL processor ,
    International Workshop on Highly Efficient Accelerators and Reconfigurable Technologies (HEART) in conjunction with the 24th ACM International Conference on Supercomputing (ICS), pp. 97-100, Epochal Tsukuba, Tsukuba, Japan, June 1-4, 2010. (ACM Computer Architecture News)
  13. C. Tadonki,
    Ring pipelined algorithm for the algebraic path problem on the CELL Broadband Engine,
    Workshop on Applications for Multi and Many Core Architectures (WAMMCA 2010) in conjunction with the International Symposium on Computer Architecture and High Performance Computing (SBAC PAD 2010), Petropolis, Rio de Janeiro, Brazil, October 27-30, 2010. (IEEE digital library) - abstract - slides - pdf - code
  14. C. Tadonki,
    Large Scale Kronecker Product on Supercomputers,
    2nd Workshop on Architecture and Multi-Core Applications (WAMCA 2011) in conjunction with the International Symposium on Computer Architecture and High Performance Computing (SBAC PAD 2011), Petropolis, Rio de Janeiro, Brazil, October 27-30, 2010. (IEEE digital library) - abstract - slides - pdf - code

2. Operation Research (algorithm, modeling, method, tool)
Overview. The main concern here is operation research and convex optimization. My work on this topic covers the design and implementation of efficient solvers for both continuous optimization and discrete optimization. The connection between the two universes through geometric, analytic, and algebraic techniques (global optimization, semidefinite programming, spectral theory, ...) is something which makes the topic very exciting, since this synergy has proven to be a good way of tackling difficult combinatorial problems.
  1. L. Drouet, A. Dubois, A. Haurie and C. Tadonki,
    A MARKAL-Lite Model for Sustainable Urban Transportation,
    Optimization days, Montreal, Canada, May, 2003. 
  2. Claude Tadonki,
    ProxAccpm: A convex optimization solver,
    International Symposium on Mathematical Programing, ISMP2003, Copengagen, Danmark, August 2003 ( get it! ).
  3. O. Briant, C. Lemaréchal,K. Monneris,N. Perrot,C. Tadonki,F. Vanderbeck,J.-P. Vial,C. Beltran,P. Meurdesoif,
    Comparison of various approaches for column generation,
    Eigth Aussois Workshop on Combinatorial Optimization, 5-9 january 2004.
  4. Claude Tadonki and Jean-Philippe Vial,
    Efficient algorithm for linear pattern separation,
    International Conference on Computational Science, ICCS04 (LNCS/Springer), Krakow, Poland, June 2004 .
  5. Cesar Beltran, Claude Tadonki, Jean-Philippe Vial,
    Semi-Lagrangian relaxation ,
    Computational Management Science Conference and Workshop on Computational Econometrics and Statistics, Link, Neuchatel, Switzerland, April 2004 .
  6. Claude Tadonki, Cesar Beltran and Jean-Philippe Vial ,
    Portfolio management with integrality constraints,
    Computational Management Science Conference and Workshop on Computational Econometrics and Statistics, Link, Neuchatel, Switzerland, April 2004 .
  7. C. Beltran, C. Tadonki and J.-Ph. Vial,
    The p-median problem solved by semi-Lagrangian relaxation,
    First Mathematical Programming Society International Conference on Continuous Optimization (ICCOPT I), Troy, USA, August 2-4, 2004.

3. Scientific and Technical Computing (Sensors network, power aware computing, program comprehension, data analysis)
Overview. This group is related to specialized algorithms, program optimization and data analysis. My investigations on sensors networks mainly focus on the network topology (disk graph) and the cooperation among sensors (distributed algorithms). Concerning the power aware computing topic, the problem is to reduced the energy dissipated by the execution of a given program, particularly in a context where the energy is a critical resource (embedded systems). My contribution includes combinatorial and analytical methodologies to achieve the task of modeling energy complexity and how to reschedule the algorithm accordingly. Regarding data refinement, this is related to statistical and approximation approaches to improve the matching between experimental data and the model. Further steps in experimental research are sensitive to this agreement.
  1. Claude Tadonki, Mitali Singh, Jose Rolim and Viktor K. Prasanna,
    Combinatorial Techniques for Memory Power State Scheduling in Energy Constrained Systems,
    Workshop on Approximation and Online Algorithms
    (WAOA), WAOA2003 (LNCS/Springer), Budapest, Hungary, September 2003 .
  2. Claude Tadonki and Jose Rolim ,
    An analytical model for energy minimization,
    III Workshop on Efficient and Experimental Algorithms, WEA04 (LNCS/Springer), Angra dos Reis, Rio de Janeiro, Brazil, May 2004.
  3. Claude Tadonki ,
    Universal Report: A Generic Reverse Engineering Tool ,
    12th IEEE International Workshop on Program Comprehension, IWPC 2004 (IEEE), University of Bari, Bari, Italy , June 2004 .
  4. Claude Tadonki and Jose Rolim,
    An integer programming heuristic for the dual power management problem in wireless sensor networks,
    2nd International Workshop on Managing Ubiquitous Communications and Services, MUCS2004, Dublin, Ireland, December 13, 2004.
  5. Claude Tadonki,
    Refinement experiments with RADDAM data,
    EMBL bilateral meeting, Hamburg, Germany, June 26-28, 2006.
  6. Claude Tadonki,
    Off-line settings in wireless networks,
    3rd International Symposium on Computational Intelligence and Intelligent Informatics, ISCIII2007, Agadir, Morocco, March 28-30, 2007.
 

 Technical Reports
  1. Claude Tadonki, and Bernard Philippe, Parallel multiplication of a vector by a Kronecker product of matrices, IRISA report n° 1194, 1998.
    Optimal Parallel Algorithm for the Kronecker Product in log(p) communication steps
  2. Patrice Quinton, Claude Tadonki, et Maurice Tchuente,  Un échéancier systolique et son utilisation dans l'ATMIRISA report n° 1348, 2000.
  3. Claude Tadonki, Synthèse d'ordonnancements parallèles par reproduction canonique, IRISA report n° 1349, also INRIA report n° 3996, 2000.
  4. David Cachera, Sanjay Rajopadhye, Tanguy Risset, and  Claude Tadonki, Parallelization of the algebraic path problem on linear simd/spmd arrays, IRISA report n° 1409, 2001.
  5. Claude Tadonki and Jean-Philippe Vial, The linear separation problem revisited with accpm , Cahier de Recherche n° 2002.11, University of Geneva, June 2002. ( get it! )
  6. F. Babonneau, C. Beltran, O. du Merle, C. Tadonki and J.-P. Vial, The proximal analytic center cutting plane method, Technical report, Logilab, HEC, University of Geneva, 2003.
  7. Cesar Beltran, Claude Tadonki, and Jean-philippe Vial, Semi-Lagrangian relaxation, Technical report, Logilab, HEC, University of Geneva, 2004.
  Book Chapters
  1. Ordonnancement pour l'informatique parallele, A. Moukrim and C. Picouleau (Edt), Hermes, ( details! ).
 Unpublished/Submited material
  1. A. Dubois, A. Haurie, C. Tadonki, and D. Zachary, An Operational Energy Modelling System, 2003 ( get it! )
  2. R. Ndoundam, C. Tadonki, and M. Tchuente, Parallel chip firing game associated with n-cube orientation, 2000 ( get it! ).
  3. F. Babonneau, C. Beltran, O. du Merle, C. Tadonki, and J.-P. Vial, The proximal analytic center cutting plane method , 2003 ( get it! ).
  4. C. Tadonki, Universal Report: A generic reverse engeneering tool, 2003 ( get it! ).
  5. C. Tadonki, M. Singh, R. Jose, and V. Prasanna, Combinatorial technic for memory power state scheduling in energy-constrained system, 2003 ( get it! ).
  6. D. Cachera, S. Rajopadye, T. Risset, and C. Tadonki,  Algorithmic tiling for efficient parallel APP implementation, 2003 ( get it! ).
  7. C. Tadonki and J.-P. Vial,  Portfilio Selection with Cardinality and Bound Constraint, 2003 ( get it! ).
  8. C. Tadonki and J. Rolim,  An analytical model for energy minimization, 2003 ( get it! ).
  9. C. Tadonki and J. Rolim,  An integer programming heuristic for the dual power management problem in wireless sensor networks, 2004 ( get it! ).
 Work in progress
  1. Portfolio Selection with Cardinality Constraints
  2. Efficient Matrix Computations in Cutting Planes Algorithms
  3. Algorithmic Technics of Designing Energy Efficient Algorithms
  4. Structural Method for Lower Bounds in Complexity
  5. Improved Graph Model for Optical Network of Sensors
  6. Dynamic Behavior of Parallel Chip Firing Game on Regular Graphs
  7. Chapter of a book on "Parallel Scheduling" published by Hermes
  8. Book in "Scientific Computation"
How to use CPLEX with Matlab

Softwares

Download my CV in PDF