SDC, Data Science and Knowledge

Difference between revisions of "Ali Ayadi"


Revision as of 16:08, 19 February 2016

PhD student in the SDC team (formerly BFO team) of the ICube laboratory of the University of Strasbourg since May 2015.

Contact

Ali AYADI
ICube Laboratory
Télécom Physique Strasbourg
300 bd Sébastien Brant - CS 10413
F - 67412 Illkirch cedex
Office: C325
Phone: +33 (0) 3 68 85 45 78
Email: ali.ayadi (at) unistra (dot) fr

Research

PhD Thesis

Title: Semantic technologies for the optimization of complex molecular networks

Promotor: Cecilia Zanni-Merk (Senior Tenured Associate Professor, ICube-SDC and INSA Strasbourg)

Co-advisors: François de Bertrand de Beuvron (Tenured Associate Professor, ICube-SDC and INSA Strasbourg) and Julie Thompson (CNRS Research Director, ICube-CSTB)


Overview: This PhD thesis is prepared as part of a Franco-Tunisian cotutelle between the SDC Team, in collaboration with the LBGI Team, and the University of Tunis. The project aims to develop a platform for optimising the transittability of complex biomolecular networks by offering transition-steering mechanisms that drive these networks from an unexpected state to a desired state.


A complex biomolecular network is formed by the interactions of many molecules (genes, proteins and metabolites) in a cell. This network should remain in a normal (or at least healthy) phenotype. However, under some unknown perturbation or stimulus, the network can move from a normal phenotype to a disease phenotype. It is therefore desirable to steer the biomolecular network from the abnormal phenotype back to a healthy one.

Here we are interested in how to effectively steer the system from an unexpected state to a desired state by applying suitable input control signals. The main purpose of this project is to provide a platform based on two strengths of the Data Mining Theme of the SDC team of ICube: semantic technologies on the one hand, and combinatorial optimisation tools on the other.


To study the phenotype transitions, we will propose a model in which a regulatory biomolecular network is represented by a directed network: molecules are represented by nodes and the interactions between molecules by arcs [2,3,4,5,6,15,16,17]. As a result, cellular phenotypes can be defined by the network states, which collectively represent all the molecular expressions in the network, while a phenotypic change or a change in cellular behavior can be described as a dynamic transition between two states of the network.
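The representation described above can be illustrated with a minimal sketch. It assumes Boolean node states and a synchronous update rule, neither of which is specified on this page; the molecule names A, B and C and the signed arcs are hypothetical.

```python
# Hypothetical three-molecule regulatory network (names are illustrative only).
# Each arc (source, target) carries a sign: +1 for activation, -1 for inhibition.
ARCS = {
    ("A", "B"): +1,  # A activates B
    ("B", "C"): +1,  # B activates C
    ("C", "A"): -1,  # C inhibits A
}
NODES = ("A", "B", "C")


def step(state):
    """Return the next network state under a synchronous Boolean update.

    Assumed rule: a node is on when the signed sum of its currently
    active regulators is strictly positive, and off otherwise.
    """
    nxt = {}
    for node in NODES:
        signal = sum(
            sign for (src, dst), sign in ARCS.items() if dst == node and state[src]
        )
        nxt[node] = signal > 0
    return nxt


def trajectory(state, n_steps):
    """List the successive network states, starting with the given one."""
    states = [state]
    for _ in range(n_steps):
        states.append(step(states[-1]))
    return states
```

Here a network state is a dictionary mapping each molecule to its expression level, and a transition is one call to `step`; a phenotypic change then corresponds to moving between two such dictionaries.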

The general goal of this thesis is to find an optimal set of external stimuli to be applied during a predetermined time interval to evolve the network from its current state to another desired state. Our original approach is based on the combined use of semantic technologies, combinatorial optimization and simulation.
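As a toy illustration of this goal, the sketch below searches exhaustively for the smallest set of stimuli that drives a network to a desired state within a given number of steps. It is only a brute-force stand-in, not the approach of the thesis: the Boolean dynamics, the hypothetical molecules A, B and C, and the modelling of a stimulus as holding a node at a fixed value are all assumptions.

```python
from itertools import combinations, product

# Hypothetical signed regulatory network (illustrative names only).
ARCS = {("A", "B"): +1, ("B", "C"): +1, ("C", "A"): -1}
NODES = ("A", "B", "C")


def step(state, clamp):
    """Synchronous Boolean update; nodes in `clamp` are held at the stimulus value."""
    nxt = {}
    for node in NODES:
        if node in clamp:
            nxt[node] = clamp[node]
            continue
        signal = sum(
            sign for (src, dst), sign in ARCS.items() if dst == node and state[src]
        )
        nxt[node] = signal > 0
    return nxt


def smallest_stimulus(start, target, horizon):
    """Return the smallest clamp (dict node -> bool) that reaches `target`
    within `horizon` update steps, or None if no clamp does."""
    for size in range(len(NODES) + 1):
        for nodes in combinations(NODES, size):
            for values in product((False, True), repeat=size):
                clamp = dict(zip(nodes, values))
                state = {**start, **clamp}  # the stimulus acts from time 0
                for _ in range(horizon):
                    if state == target:
                        break
                    state = step(state, clamp)
                if state == target:
                    return clamp
    return None
```

With every molecule initially off and a target where all are on, a horizon of 3 steps is enough for a single stimulus on A to propagate through the network, whereas a horizon of 1 step requires stimulating both A and B. The search is exponential in the number of nodes, which is why realistic networks call for dedicated combinatorial optimization methods.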

With this aim in mind, our future work will continue to develop a platform for studying the transitions of biomolecular networks from any state to a specific state, based on three modules:
1. The ontological module: This module uses semantic technologies to infer new knowledge (the discovery of new semantic associations between molecules) in order to refine the study of the transitions in the network's behavior. Its input is a set of native data (network states and transitions in the form of values and parameters) provided by the expert; its output is the inferred network, composed of native and inferred transition states. This enrichment with metadata and new knowledge will facilitate decision making through better knowledge management.
2. The simulation module: This module will reproduce over time the dynamic behavior of each network component. The simulator will adopt the Discrete Event System Specification (DEVS) formalism.
3. The optimization module: This module applies combinatorial optimization algorithms to provide a set of transition sequences offering the best control of the network from one state to another, while describing all the value changes taking place inside each network component.
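To give an idea of what the simulation module's formalism involves, here is a minimal sketch of a single DEVS-style atomic model with its time-advance, external-transition, output and internal-transition functions, driven by a very small event loop. The `Molecule` component, its two phases and the fixed activity duration are hypothetical, and the loop is a simplification written for this illustration rather than a full DEVS simulator.

```python
import math


class Molecule:
    """Atomic model: a molecule that stays active for a fixed duration after a stimulus."""

    def __init__(self, active_duration=2.0):
        self.phase = "idle"
        self.active_duration = active_duration
        self.sigma = math.inf  # time remaining until the next internal event

    def time_advance(self):
        return self.sigma

    def external_transition(self, elapsed, event):
        if event == "activate":
            self.phase, self.sigma = "active", self.active_duration
        else:
            self.sigma -= elapsed  # unknown event: keep waiting for the same deadline

    def output(self):
        # Called just before the internal transition, as in DEVS.
        return "deactivated"

    def internal_transition(self):
        self.phase, self.sigma = "idle", math.inf


def simulate(model, inputs, end_time):
    """Run one atomic model against timed inputs [(time, event), ...].

    Returns the phase log [(time, phase), ...] and the outputs [(time, value), ...].
    Ties between an internal and an external event run the internal one first.
    """
    pending = sorted(inputs)
    log, outputs = [(0.0, model.phase)], []
    last = 0.0
    while True:
        t_int = last + model.time_advance()
        t_ext = pending[0][0] if pending else math.inf
        t_next = min(t_int, t_ext)
        if math.isinf(t_next) or t_next > end_time:
            break
        if t_int <= t_ext:
            outputs.append((t_next, model.output()))
            model.internal_transition()
        else:
            _, event = pending.pop(0)
            model.external_transition(t_next - last, event)
        last = t_next
        log.append((last, model.phase))
    return log, outputs
```

For example, `simulate(Molecule(2.0), [(1.0, "activate")], 10.0)` logs the molecule as idle at time 0, active at time 1 and idle again at time 3. A complete simulator would couple many such components, one per network element.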


Relational data mining is a subfield of data mining where data is not represented according to the classic attribute-value model, in which every row of a single table represents a training instance with its properties, including the attribute to predict. Here, data is represented by several tables linked with foreign keys, which represent the different kinds of objects constituting the problem. One table, called the main table, contains the training instances (for instance, molecules) with the attribute to learn, and other tables (for instance, a table of the atoms constituting the molecules) contain the secondary objects linked to the main ones. We intend to take the properties of such secondary objects into account when learning on the main objects. One way to do so, in which we are particularly interested, is the use of complex aggregates. A complex aggregate applies an aggregate function to the secondary objects that are linked to one main object and meet a certain condition. More intuitively, it summarizes the secondary table in a single value. Two examples of such aggregates are the number of carbon atoms in the molecule and the average charge of the oxygen atoms of the molecule. However, the number of possible aggregate conditions and aggregate functions makes the exhaustive generation of all complex aggregates intractable. One of the goals of the PhD thesis is to propose a heuristic that explores the complex aggregate space and incrementally generates the aggregates that are relevant to the given problem.
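The two example aggregates mentioned above can be written down as a small sketch. The atom table, its column names and its values are invented for illustration; only the idea of a condition combined with an aggregate function comes from the text.

```python
from statistics import mean

# Hypothetical secondary table: one row per atom, linked to its molecule
# through the foreign key "molecule_id".
ATOMS = [
    {"molecule_id": 1, "element": "C", "charge": -0.125},
    {"molecule_id": 1, "element": "O", "charge": -0.5},
    {"molecule_id": 1, "element": "O", "charge": -0.25},
    {"molecule_id": 2, "element": "C", "charge": 0.25},
    {"molecule_id": 2, "element": "C", "charge": 0.0},
]


def complex_aggregate(molecule_id, condition, function):
    """Apply `function` to the atoms of one molecule that satisfy `condition`."""
    selected = [
        atom for atom in ATOMS
        if atom["molecule_id"] == molecule_id and condition(atom)
    ]
    return function(selected)


def count(rows):
    return len(rows)


def average_charge(rows):
    # An empty selection has no average; None marks the missing value.
    return mean(row["charge"] for row in rows) if rows else None


def is_carbon(atom):
    return atom["element"] == "C"


def is_oxygen(atom):
    return atom["element"] == "O"
```

For molecule 1, `complex_aggregate(1, is_carbon, count)` gives the number of carbon atoms and `complex_aggregate(1, is_oxygen, average_charge)` the average charge of its oxygen atoms. Each such value becomes one extra attribute of the main table, and the combinatorial growth comes from the many possible (condition, function) pairs.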

The other domain on which this PhD thesis focuses is multi-class cost-sensitive learning. In this kind of problem, the attribute to learn can take more than two values, unlike the binary problems for which many learning algorithms are designed. Moreover, classification errors do not all have the same cost, as can be expected in a medical domain, where diagnosing a disease in a healthy patient does not have the same impact as failing to diagnose the disease in a sick patient. In this framework, we are particularly interested in binarization approaches, which consist in reducing a multi-class problem to several binary problems. More specifically, we consider the case where the binarization uses scorers, the scores being used to set decision thresholds between the two classes of each binary subproblem.
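For a single binary subproblem, setting a cost-sensitive decision threshold on scores can be sketched as follows. This is a generic illustration under stated assumptions, not the method of the thesis: the scores, labels and cost values are invented, and the threshold is simply chosen to minimise the total misclassification cost on the given examples.

```python
def best_threshold(scores, labels, cost_fp, cost_fn):
    """Return the threshold with the lowest total misclassification cost.

    An example is predicted positive when its score is >= the threshold.
    Candidate thresholds are the observed scores, plus infinity
    (which predicts everything as negative).
    """
    candidates = sorted(set(scores)) + [float("inf")]

    def total_cost(threshold):
        cost = 0
        for score, is_positive in zip(scores, labels):
            predicted_positive = score >= threshold
            if predicted_positive and not is_positive:
                cost += cost_fp  # false positive
            elif not predicted_positive and is_positive:
                cost += cost_fn  # false negative
        return cost

    # min() keeps the first, i.e. lowest, threshold when several tie.
    return min(candidates, key=total_cost)
```

With scores `[0.1, 0.4, 0.6, 0.9]` and labels `[False, True, False, True]`, a false negative five times as costly as a false positive yields a threshold of 0.4, while the opposite cost ratio yields 0.9. In the medical example, a missed diagnosis being more costly pushes the threshold down so that more patients are flagged.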

Teaching

Teaching assistant at the UFR Mathématiques-Informatique (Department of Mathematics and Computer Science) and at the Faculté de Géographie et d'Aménagement (Faculty of Geography and Planning) of the University of Strasbourg.

2014/2015:

  • L1/MathInfo Computer Science S1: Computer and Internet certificate (C2i)
  • Master1/GE-OTG Computer Science S2: Spatial databases and SQL (PostgreSQL)
  • L1/MathInfo Computer Science S2: Databases and SQL (Oracle)
  • L1/MathInfo Computer Science S2: Object-Oriented Programming (OCaml)

Publications

<anyweb>http://icube-publis.unistra.fr/?author=AYadi_Ali&=#hideMenu</anyweb>