Sequencing and bioinformatics: compiled questions and answers

Q1: What are the advantages of high-throughput sequencing over gene chips?

A: High-throughput sequencing has the following advantages:

1. High accuracy and high reproducibility.

2. It detects known genes and can also discover new genes.

3. A wider detection range for gene expression, so both high-abundance and low-abundance genes are measured more accurately.

4. It yields the gene sequence directly, so the raw data are not affected by changes in genome version. Chip raw data are probe information, which can become invalid when a genome upgrade changes the annotation. Sequencing is therefore better suited to the long-term accumulation and analysis of a laboratory's sample data.

Q2: I want to study the breeding of a marine organism whose genomic data have not been published. Can molecular markers be screened through transcriptome sequencing?

A: Yes. Molecular markers such as SSRs can be detected in individual samples through transcriptome sequencing. RAD sequencing is another option for screening molecular markers.
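As a rough illustration of what SSR detection in assembled transcripts involves, here is a minimal sketch that scans a sequence for perfect tandem repeats. The function name `find_ssrs` and the minimum repeat counts are assumptions chosen for the example, not the settings used in our pipeline.

```python
import re

# Assumed thresholds: minimum number of repeats for each motif length (1-6 bp).
# Illustrative defaults only; adjust them to your own screening criteria.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, repeat_count) for perfect SSRs found in seq."""
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in sorted(min_repeats.items()):
        # One motif followed by at least (min_rep - 1) further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are just repeats of a shorter unit (e.g. "AA", "ATAT"),
            # because the shorter motif length already reports them.
            if any(
                motif == motif[:k] * (motif_len // k)
                for k in range(1, motif_len)
                if motif_len % k == 0
            ):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return hits


# Example: a transcript fragment carrying an (AG)7 repeat starting at position 3.
print(find_ssrs("GGC" + "AG" * 7 + "TTC"))
```

Positions are 0-based. Real projects normally use a dedicated SSR tool, which also handles compound and imperfect repeats that this sketch ignores.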

Q3: No liquid nitrogen was available in the laboratory when sampling for transcriptome sequencing. Can the samples be stored directly in a -80℃ freezer?

A: No. RNA degrades easily in an environment rich in RNases, so samples need to be snap-frozen in liquid nitrogen and then kept at low temperature. A freezer cannot freeze samples quickly enough.

Q4: For a species without a reference genome, can alternative splicing be predicted from transcriptome data?

A: No. A reference genome is required to predict alternative splicing.

Q5: I received the final report of my expression profile sequencing, and it shows thousands of differentially expressed genes. How can I find the functional genes I am interested in? Do I have to go through them one by one?

A: If you have already identified a metabolic pathway of interest, open the KEGG enrichment analysis in the final report, follow the hyperlink for that pathway to open its gene list, and view the differentially expressed genes in the pathway. If there are still too many genes, click the hyperlink on the pathway name and select the differentially expressed genes located upstream in the pathway map.

Q6: Why is a difference of more than two-fold in expression used to define differentially expressed genes between two samples?

A: The fold-change threshold is set according to your research needs, and its purpose is to yield a suitable number of differentially expressed genes. There is no fixed rule. A two-fold difference is the screening standard accepted by international journals. If you find there are too many differentially expressed genes, you can raise the fold-change threshold yourself. Consult the technical staff at Gidio for details.
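The effect of raising the threshold can be sketched as follows. This is a minimal example that applies only the fold-change cutoff. The function name `classify_genes`, the pseudocount and the toy expression values are assumptions for illustration, and a real analysis also applies a statistical significance filter.

```python
import math


def classify_genes(expr_a, expr_b, fold=2.0, pseudocount=1.0):
    """Label each shared gene 'up', 'down' or 'unchanged' in sample B relative to A.

    expr_a / expr_b map gene IDs to normalized expression values.
    The pseudocount (an assumption of this sketch) avoids division by zero.
    """
    threshold = math.log2(fold)
    labels = {}
    for gene in expr_a.keys() & expr_b.keys():
        log2fc = math.log2((expr_b[gene] + pseudocount) / (expr_a[gene] + pseudocount))
        if log2fc >= threshold:
            labels[gene] = "up"
        elif log2fc <= -threshold:
            labels[gene] = "down"
        else:
            labels[gene] = "unchanged"
    return labels


# Toy values, not real data.
sample_a = {"g1": 10, "g2": 100, "g3": 50}
sample_b = {"g1": 50, "g2": 20, "g3": 60}
print(classify_genes(sample_a, sample_b, fold=2.0))
print(classify_genes(sample_a, sample_b, fold=8.0))  # stricter cutoff, fewer genes
```

With the two-fold cutoff, g1 and g2 are reported as differential. With the eight-fold cutoff, none of the three genes passes.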

Q7: Why does your transcriptome final report contain no statistics on mRNA alternative splicing?

A: For species without a reference genome, transcriptome sequencing cannot distinguish alternative splicing from gene families: we cannot tell whether two similar sequences come from the same gene. The transcriptome results therefore do not include alternative splicing analysis.

Q8: How can I judge whether the gene expression results are reliable?

A: There are several ways. First, check whether the sequencing and alignment statistics in the final report are normal. Second, under normal circumstances the numbers of up-regulated and down-regulated genes from a comparison between two samples should be of roughly the same order of magnitude. Third, compare the expression levels of constitutively expressed genes (housekeeping genes) in the two samples, which should be roughly the same under normal circumstances. We check the results before delivering them, so you can use them with confidence.

Q9: I want to download the transcriptome sequencing data of a certain species from the NCBI database. How do I do this?

A: In the NCBI database, transcriptome sequencing reads are stored compressed in SRA format. Search NCBI for the species, then look under the SRA category of the search results for the sequencing information and download links. Use the SRA conversion tool to convert the downloaded data to the FASTQ format required by analysis software.
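The download and conversion step might look like the sketch below, assuming the conversion tool is the NCBI SRA Toolkit (`prefetch` and `fasterq-dump`) and that it is installed and on the PATH. The helper name `sra_to_fastq_commands`, the output directory and the accession `SRRXXXXXXX` are placeholders for illustration.

```python
import shlex


def sra_to_fastq_commands(accession, outdir="fastq"):
    """Build the SRA Toolkit commands that fetch one run and convert it to FASTQ.

    Assumes the NCBI SRA Toolkit is installed; nothing is executed here.
    """
    return [
        # Download the .sra archive for the run.
        ["prefetch", accession],
        # Convert to FASTQ, writing paired reads to separate files.
        ["fasterq-dump", accession, "--split-files", "--outdir", outdir],
    ]


# "SRRXXXXXXX" is a placeholder; replace it with the run accession found in the SRA search.
for command in sra_to_fastq_commands("SRRXXXXXXX"):
    print(shlex.join(command))
```

To run the commands instead of printing them, pass each list to `subprocess.run(command, check=True)`.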

Q10: The species I study has no reference genome, and I have six samples in total. I plan to do transcriptome sequencing on a mixed sample and then expression profile sequencing on each sample separately. Do you have any better ideas?

A: Your approach is reasonable: sequence the transcriptome of the mixed sample to obtain as complete a transcriptome as possible, and use it as the reference sequence for expression profile sequencing. There is a better way, though. You can run PE100 sequencing on each sample with 2G of data, then pool the expression profile reads for assembly, which also yields the transcriptome of the species.

Q11: I had sequencing done at your company and am now writing my paper. I don’t know much about your sequencing methods and results. Could you help?

A: We have written up the figures and tables produced by each of our service lines together with the experimental and analysis methods, and translated these documents with reference to papers published in well-known SCI journals. Contact our sales staff to obtain them.

Q12: In lncRNA analysis, how do you analyze the regulation of genes by lncRNA?

A: Because the functional mechanism of lncRNA is not yet well understood, we generally adopt the lncRNA cis-regulation hypothesis and take the genes located within the 10 kb region around a lncRNA as potential lncRNA-related genes. If expression profile sequencing is carried out at the same time, we can screen these genes by their co-expression with the lncRNA and obtain a more reliable lncRNA-gene regulatory network.
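The 10 kb neighborhood search can be sketched as a simple interval comparison. This is only an illustration: the `Feature` tuple, the function name `cis_candidates` and the coordinates are hypothetical, and counting any gene that overlaps the window (rather than requiring it to lie fully inside) is an assumption of the sketch.

```python
from collections import namedtuple

# Minimal genomic feature: name, chromosome, start and end coordinates.
Feature = namedtuple("Feature", "name chrom start end")


def cis_candidates(lncrnas, genes, window=10_000):
    """Map each lncRNA name to the genes overlapping the region within `window` bp of it."""
    result = {}
    for lnc in lncrnas:
        lo, hi = lnc.start - window, lnc.end + window
        result[lnc.name] = [
            gene.name
            for gene in genes
            if gene.chrom == lnc.chrom and gene.start <= hi and gene.end >= lo
        ]
    return result


# Hypothetical coordinates for illustration only.
lncrnas = [Feature("lnc1", "chr1", 50_000, 52_000)]
genes = [
    Feature("geneA", "chr1", 55_000, 58_000),  # 3 kb downstream: candidate
    Feature("geneB", "chr1", 70_000, 72_000),  # 18 kb downstream: too far
    Feature("geneC", "chr2", 50_000, 51_000),  # different chromosome
]
print(cis_candidates(lncrnas, genes))
```

The candidate list produced this way is the input to the co-expression screen described above.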

Q13: For an unknown species, how many unigenes are generally appropriate for a transcriptome de novo project?

A: The number of unigenes in a transcriptome can vary considerably from species to species. Because each gene may produce multiple transcripts, and based on our project experience so far, the number of unigenes for a typical species is 2 to 4 times the number of genes.

Q14: I plan to study a species whose genome has not been published and want to know basic information about its genome. How can I find it?

A: You can look it up in relevant online databases, such as the Animal Genome Size Database, PlantGDB, and the Plant DNA C-values Database.

Q15: For a species with two independent transcriptome assemblies, how can the correspondence between the unigenes in the two results be determined?

A: Usually you are concerned with a particular class of genes that you have already found in one of the transcriptome results. In that case, align the genes of interest against the other transcriptome data with local BLAST and identify the corresponding unigenes.

If the volume is large, we offer a customized analysis service that runs BLAST in bulk.
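Once local BLAST has been run with tabular output (`-outfmt 6`), picking the corresponding unigene for each query can be sketched as below. The function name `best_hits`, the E-value cutoff and the example IDs are assumptions for illustration. The column positions follow the default 12-column tabular layout, in which E-value and bit score are the last two fields.

```python
def best_hits(blast_lines, max_evalue=1e-5):
    """Return {query_id: subject_id} keeping the highest-bit-score hit per query.

    Expects default BLAST tabular lines (-outfmt 6): qseqid, sseqid, pident,
    length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore.
    """
    best = {}
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue > max_evalue:
            continue  # hit too weak under the assumed cutoff
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


# Made-up hits: Unigene0001 from assembly 1 against unigenes of assembly 2.
example = [
    "Unigene0001\tUnigeneB_17\t98.5\t900\t13\t0\t1\t900\t1\t900\t0.0\t1600",
    "Unigene0001\tUnigeneB_42\t85.0\t300\t45\t0\t1\t300\t10\t309\t1e-80\t300",
]
print(best_hits(example))
```

In practice you would read the lines from the BLAST result file, for example with `best_hits(open("result.tsv"))`, where the file name is a placeholder.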

Q16: In an RNA-seq project with multiple samples, one pairwise comparison shows a large difference (two orders of magnitude) between the numbers of up-regulated and down-regulated genes. Is this normal?

A: In theory, the total amount of RNA in a cell is relatively stable, so the numbers of up-regulated and down-regulated genes should generally not differ much. If they differ by more than an order of magnitude, check whether the sample contains residual rRNA or is contaminated with RNA from other species.
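The order-of-magnitude check is simple arithmetic and can be sketched as follows. The function names and the example counts are made up for illustration. The one-order-of-magnitude limit comes from the answer above.

```python
import math


def up_down_imbalance(n_up, n_down):
    """Return the absolute log10 ratio between up- and down-regulated gene counts."""
    if n_up <= 0 or n_down <= 0:
        return math.inf  # one side is empty: treat as maximally imbalanced
    return abs(math.log10(n_up / n_down))


def needs_review(n_up, n_down, max_orders=1.0):
    """True when the counts differ by more than `max_orders` orders of magnitude."""
    return up_down_imbalance(n_up, n_down) > max_orders


# Made-up counts for two pairwise comparisons.
print(needs_review(1200, 900))  # balanced comparison
print(needs_review(5000, 40))   # about two orders of magnitude apart
```

A comparison flagged this way is a prompt to inspect rRNA content and possible cross-species contamination, not proof that the data are unusable.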

Q17: For a high-throughput transcriptome sequencing project on an unknown species, how should the sequencing platform and data volume be chosen?

A: The Illumina sequencing platform holds more than 80% of the market thanks to its stable sequencing quality and cost-effective data. For data volume we generally recommend 4-8G (this is not a fixed standard; it changes as the industry develops, and different researchers may have different needs).

Q18: Is it necessary to include biological replicates in high-throughput sequencing projects?

A: In theory, biological replicates are the more reasonable and rigorous approach. If funding allows, appropriate biological replicates (at least three) are advisable. In the current state of high-throughput research, however, cost is a consideration, and studies of a single sample can still be accepted by the academic community. For example, journals with an impact factor below 5 can still accept studies of samples without replicates. Over time, though, biological replicates will become unavoidable.

Q19: I tried to open a sequencing result file, but there was no response for a long time. What happened?

A: Some files are large, and general-purpose software (Word, Notepad, etc.) usually reads the whole file into memory, which consumes a lot of computer resources and makes the program slow or even crash. We recommend Vim, UltraEdit or similar software.
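If you only need a quick look at a large result file, reading it line by line avoids loading it into memory at all. The sketch below is one way to do this in Python. The function name `preview` and the file name in the usage note are placeholders.

```python
from itertools import islice


def preview(path, n_lines=20):
    """Return the first n_lines of a text file without reading the whole file."""
    # The file object is iterated lazily, so only the requested lines are read.
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        return [line.rstrip("\n") for line in islice(handle, n_lines)]
```

For example, `preview("all_unigene_expression.txt", 10)` would return the first ten lines of a file with that (hypothetical) name, however large the file is.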

Q20: Some plug-ins are not displayed in the final report. What is the matter?

A: This is caused by compatibility problems between the browser and the Java plug-in version. It has little effect on the overall interpretation of the results. Heat maps, for example, generally cannot be used directly in an article anyway and can be redrawn later for the specific genes you select.
