## Methods

The idea behind LIPEA is to identify specific altered pathways, as provided by the KEGG database, using exclusively lipid compounds. The approach used for this task is Over Representation Analysis (ORA). ORA starts from a list of annotated lipids (e.g. a lipid set related to a signature), then uses the Fisher exact test to verify whether the annotations are over-represented in a label (pathway) compared to the whole universe of lipids (the background). The background can either be “predefined” for a specific organism (that is, LIPEA takes all the compounds from the pathways related to the selected organism) or be a custom list given by the user.

### Procedure

The steps of the algorithm used to implement the ORA are the following.

## Step 1

Set an organism, collect the lipid list and the background.
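As a rough illustration of the two background options, the sketch below builds a "predefined" background as the union of all pathway members and falls back to it when no custom list is given. The function name `choose_background`, the pathway names, and the lipid identifiers are hypothetical and are not taken from LIPEA or KEGG.

```python
def choose_background(pathways, custom_background=None):
    """Return the set of lipids used as the universe for the analysis.

    pathways: dict mapping a pathway name to an iterable of lipid identifiers
              (assumed to be the pathways of the selected organism).
    custom_background: optional iterable of lipid identifiers given by the user.
    """
    if custom_background is not None:
        return set(custom_background)
    # "Predefined" option: every compound that appears in at least one pathway.
    return set().union(*(set(members) for members in pathways.values()))


# Made-up data, for illustration only.
organism_pathways = {
    "pathway_A": ["L1", "L2", "L3"],
    "pathway_B": ["L3", "L4"],
}
print(sorted(choose_background(organism_pathways)))
print(sorted(choose_background(organism_pathways, ["L1", "L2", "L9"])))
```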

## Step 2

Select a pathway to start with.

## Step 3

Tally the following 4 numbers:

*m*, *N*, *k*, and *n*, where *m* is the total number of lipids in the pathway, *N* is the total number of lipids, *k* is the number of lipids in the intersection between the lipid list and the pathway, and *n* is the total number of lipids in the list.

## Step 4
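Before the test is run, the four counts can be obtained with plain set operations. The sketch below is a minimal illustration, not LIPEA's actual code: the function name `tally_counts`, the lipid identifiers, and the choice to restrict both the list and the pathway to the background are assumptions.

```python
def tally_counts(lipid_list, pathway, background):
    """Return (m, N, k, n) for one pathway."""
    background = set(background)
    # Assumption: lipids outside the background are ignored, so that all
    # four counts refer to the same universe.
    pathway = set(pathway) & background
    lipid_list = set(lipid_list) & background
    m = len(pathway)               # lipids in the pathway
    N = len(background)            # lipids in the background
    k = len(lipid_list & pathway)  # list lipids that fall in the pathway
    n = len(lipid_list)            # lipids in the list
    return m, N, k, n


# Made-up identifiers, for illustration only.
background = {f"L{i}" for i in range(1, 21)}  # 20 lipids
pathway = {"L1", "L2", "L3", "L4", "L5"}
lipid_list = {"L1", "L2", "L3", "L10"}
print(tally_counts(lipid_list, pathway, background))  # (5, 20, 3, 4)
```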

Perform a Fisher exact test, with the 4 numbers obtained in the previous step, as follows:

$$ f(k;N,m,n) = \frac{\binom{m}{k} \binom{N - m}{n - k}}{\binom{N}{n}} $$

The *f* value is the probability of observing exactly *k* pathway lipids in the list under the hypergeometric distribution. To obtain the *p*-value associated with each pathway, the following formula is used:

$$ p = \sum_{l = k}^n f(l;N,m,n) $$

## Step 5

Go to step 2 for another pathway of interest, until all are tested.
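Steps 2 to 5 can be sketched as a loop over pathways that evaluates the hypergeometric sum for each one. This is a minimal sketch using only the Python standard library, not LIPEA's implementation; the function names and the example data are hypothetical, and restricting every set to the background is an assumption.

```python
from math import comb


def hypergeom_pmf(k, N, m, n):
    """f(k; N, m, n): probability of exactly k pathway lipids in a list of n."""
    return comb(m, k) * comb(N - m, n - k) / comb(N, n)


def ora_p_value(k, N, m, n):
    """One-sided p-value: sum of f(l; N, m, n) for l = k..n."""
    # Terms with l > m are zero, so the sum can stop at min(m, n).
    return sum(hypergeom_pmf(l, N, m, n) for l in range(k, min(m, n) + 1))


def run_ora(lipid_list, pathways, background):
    """Return {pathway name: p-value} for every pathway."""
    background = set(background)
    lipid_list = set(lipid_list) & background
    N, n = len(background), len(lipid_list)
    results = {}
    for name, members in pathways.items():
        members = set(members) & background
        m = len(members)
        k = len(members & lipid_list)
        results[name] = ora_p_value(k, N, m, n)
    return results


# Made-up data, for illustration only.
background = {f"L{i}" for i in range(1, 21)}
pathways = {
    "pathway_A": {"L1", "L2", "L3", "L4", "L5"},
    "pathway_B": {"L10", "L11", "L12", "L13", "L14", "L15"},
}
lipid_list = {"L1", "L2", "L3", "L10"}
for name, p in run_ora(lipid_list, pathways, background).items():
    print(name, round(p, 4))
```

Exact integer binomials via `math.comb` avoid rounding problems for small sets; for large backgrounds a library routine such as SciPy's hypergeometric survival function may be preferable.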

## Step 6

Correct the *p*-values with Benjamini or Bonferroni-Holm corrections.
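The two corrections can be sketched in a few lines of plain Python. The text does not say which Benjamini procedure is meant; the sketch below assumes the Benjamini-Hochberg false discovery rate adjustment, and the function names are hypothetical. Both functions return adjusted *p*-values in the same order as the input.

```python
def holm(p_values):
    """Bonferroni-Holm adjusted p-values, in input order."""
    t = len(p_values)
    order = sorted(range(t), key=lambda i: p_values[i])  # ascending
    adjusted = [0.0] * t
    running_max = 0.0
    for rank, i in enumerate(order):
        # The rank-th smallest p-value (0-based) is multiplied by (t - rank);
        # the running maximum keeps the adjusted values monotone.
        running_max = max(running_max, (t - rank) * p_values[i])
        adjusted[i] = min(1.0, running_max)
    return adjusted


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values (assumed variant), in input order."""
    t = len(p_values)
    order = sorted(range(t), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * t
    running_min = 1.0
    for j, i in enumerate(order):
        rank = t - j  # 1-based rank in ascending order
        # Scale by t / rank; the running minimum keeps the values monotone.
        running_min = min(running_min, p_values[i] * t / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up p-values, for illustration only.
raw = [0.01, 0.04, 0.03, 0.005]
print([round(p, 4) for p in holm(raw)])
print([round(p, 4) for p in benjamini_hochberg(raw)])
```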