While it is known that changes in the concentration of the target TF can distort the inferred PWM [13, 14], the extent to which cooperative and indirect interactions distort the PWM is unclear.

… with the chemical potential … (black), 4 (blue) and 6 (orange). 3 is the default value of the chemical potential used everywhere in the main text. Right panel: The K-L distance of the derived motif of Tye7 from the baseline motif, which is derived when the chemical potential is set to 0, is shown on the y-axis. The chemical potential is varied between 1 and 6 on the x-axis. (B) The fidelity of ChIP-seq is shown on the y-axis for all regions (orange), the top 25th percentile (blue) and bottom 25th percentile (black) of read count ratios. The error bars are the standard deviation in mean K-L distance and fidelity obtained after 10 replicates of simulation. (TIF)

S3 Fig: The impact of normally distributed extraction heterogeneity on motif inference for different TFs. The heterogeneity in the extraction is assumed to follow a truncated normal distribution, with the mean increasing from left to right on the x-axis in both panels. (A) More informative TFs are less distorted by a low mean extraction efficiency. For each TF, the maximum K-L distance between the baseline motif and the motif inferred in the presence of extraction heterogeneity is computed from the curves shown in (B). The information content (in bits) of the baseline motif for each TF is shown on the x-axis. (B) Dependence of K-L distance between the inferred and baseline motifs of each TF at different levels of genome-wide extraction heterogeneity. The coefficient of variation of the truncated normal varies from 0 (no variation, in blue) to 0.5 (green) and 1.0 (brown).
The error bars are the standard deviation in the mean K-L distance computed after the PWM was estimated in 10 replicates of ChIP-seq for each mean and coefficient of variation. The binding energy matrices of each TF were taken from the BEEML database. The structural class of the DNA binding domains of these TFs is listed in S1 Table. (TIF)

S4 Fig: The impact of normally distributed amplification ratio heterogeneity on motif inference for different TFs. The heterogeneity in the amplification ratio is assumed to follow a truncated normal distribution, with the mean increasing from left to right on the x-axis in both panels. (A) More informative TFs are less distorted by a low mean amplification ratio. For each TF, the maximum K-L distance between the baseline motif and the motif inferred in the presence of amplification ratio heterogeneity is computed from the curves shown in (B). The information content (in bits) of the baseline motif for each TF is shown on the x-axis. (B) Dependence of K-L distance between the inferred and baseline motifs of each TF at different levels of genome-wide amplification ratio heterogeneity. The coefficient of variation of the truncated normal varies from 0 (no variation, in blue) to 0.5 (green) and 1.0 (brown).
The error bars are the standard deviation in the mean K-L distance computed after the PWM was estimated in 10 replicates of ChIP-seq for each mean and coefficient of variation. (TIF)

S5 Fig: Distortion of motifs of different lengths due to variation in (A) the extraction efficiency, and (B) the amplification ratio. Motifs 6 and 8 bp in length were generated by sub-sampling columns of the 10-bp-long Tye7 energy matrix, while motifs of length 12, 14 and 16 bp were generated by first sub-sampling 2, 4, and 6 columns and concatenating them to the Tye7 energy matrix. The motif information content (MIC) of the baseline motif (in bits) is shown in each panel. The extraction efficiency and the amplification ratio are each assumed to be normally distributed, with the coefficient of variation set at either 0, 0.5 or 1. The error bars are the standard deviation in the mean K-L distance computed after the PWM was estimated in 10 replicates of ChIP-seq for each.
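
The quantities that recur in these captions (motif information content in bits, K-L distance between an inferred and a baseline motif, and a truncated normal parameterised by its mean and coefficient of variation) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the uniform background of 0.25, the direction of the divergence (inferred relative to baseline), the summation over columns, the pseudocount, and the truncation bounds of 0 and 1 are all assumptions made here for illustration; the captions do not specify them. The example motifs are invented numbers, not data from the study.

```python
import math
import random


def information_content(pwm, background=0.25):
    """Motif information content in bits, summed over positions.

    Assumes a uniform background frequency for the four bases.
    """
    return sum(
        p * math.log2(p / background)
        for column in pwm
        for p in column
        if p > 0
    )


def kl_distance(inferred, baseline, eps=1e-9):
    """K-L divergence D(inferred || baseline) in bits, summed over columns.

    The direction and the column sum are assumptions; eps smooths zeros.
    """
    if len(inferred) != len(baseline):
        raise ValueError("motifs must have the same number of columns")
    total = 0.0
    for p_col, q_col in zip(inferred, baseline):
        for p, q in zip(p_col, q_col):
            p_s = (p + eps) / (1 + 4 * eps)
            q_s = (q + eps) / (1 + 4 * eps)
            total += p_s * math.log2(p_s / q_s)
    return total


def sample_truncated_normal(mean, cv, rng, low=0.0, high=1.0):
    """Draw one value from normal(mean, cv * mean) truncated to [low, high].

    Uses rejection sampling. The bounds are assumed, and truncation shifts
    the realised mean and spread away from the nominal ones.
    """
    if not low <= mean <= high:
        raise ValueError("mean must lie inside the truncation interval")
    sigma = cv * mean
    if sigma == 0:
        return mean  # coefficient of variation 0 means no heterogeneity
    while True:
        x = rng.gauss(mean, sigma)
        if low <= x <= high:
            return x


# Hypothetical two-column motifs (rows are positions; columns are A, C, G, T).
baseline = [[0.7, 0.1, 0.1, 0.1], [0.1, 0.7, 0.1, 0.1]]
distorted = [[0.4, 0.2, 0.2, 0.2], [0.2, 0.4, 0.2, 0.2]]

print("baseline information content (bits):", information_content(baseline))
print("K-L distance (bits):", kl_distance(distorted, baseline))

rng = random.Random(0)
efficiencies = [sample_truncated_normal(0.5, 0.5, rng) for _ in range(5)]
print("sampled efficiencies:", efficiencies)
```

Under these assumptions, a motif identical to the baseline has a K-L distance of 0, and a flatter, less specific inferred motif gives a positive distance, which is the sense in which the captions describe a motif as distorted.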