Chapter 4: This Is the Hard Disk!
Cheng Xiang stared carefully at the screen. Here and there, base pairs were marked with a question mark, and here and there a red placeholder symbol for missing data appeared, sparse in some places and dense in others. A placeholder with a question mark stood for a base pair that could not be determined during the scan on Mars. A red placeholder meant that data had been lost in transmission from Mars to Earth. Over such a distance, a signal took several minutes just to arrive, and transmitting without any data loss would have been far too expensive.
With current technical capabilities, there was no way to implement an efficient acknowledgment and retransmission mechanism. Having a time stamp to indicate where no signal had been received was already very good.
"There are a lot of errors." Cheng Xiang sighed, but there was nothing to be done about it. That is how research goes; ideal conditions are rare. "Still, having this much information is already quite good."
"I should be satisfied."
After all, the most important task at this early stage was to make an initial assessment of the value of the box, or, to be precise, of the genes inside it.
In the project file, the panel on the left showed a tree structure with ten child nodes, representing the gene sequencing vector diagram documents for boxes 1 to 10. Double-clicking a node opened the corresponding data view. The engineering software had been developed by Galaxy Research Institute.
With Cheng Xiang overseeing it and a number of biology experts and professors constantly giving their opinions, the software was naturally smooth and reliable to use.
The most important job now was to determine the research value of these genes. This kind of puzzle-like work had always been the hardest. It felt like being told that there was treasure in the mountain ahead, without being told where it was buried, and having to find it yourself.
It was the very definition of looking for a needle in a haystack.
Cheng Xiang did not check the details of the genes right away. He thought it over first. "The box that was found contains a total of 49 small boxes of the same shape."
"Since that's the case, it is a clue in itself."
Cheng Xiang remembered a function of the software, a very convenient one: gene sequence comparison.
Like the familiar string comparison of text documents, gene comparison takes different DNA sequencing documents and checks whether the arrangement of base pairs is consistent. Wherever they differ, the position is marked. Once the comparison is complete, pressing the shortcut key F3 jumps to each difference in turn.
"Let's compare first."
"Maybe we'll discover something."
The most common way to explore is trial and error, and trial and error naturally starts with whatever costs the least. Nothing could be simpler than this. With that thought, Cheng Xiang took the mouse, selected all the documents, and right-clicked to choose Compare All. The progress bar appeared at once, and the comparison results were displayed in real time.
A 200MB DNA map is enough to store information on 1.6 billion base pairs. Comparing ten DNA sequencing maps at once is more than a computer of average power can handle. However, because the software's requirements had been set by a group of professional biological researchers, the algorithm had long taken the particularities of biological science into account. The comparison ran incrementally: it started immediately, produced results in real time, and could be stopped at any moment.
Moreover, a supercomputer had been installed in the laboratory building, and complex calculations could be handed off to it. At the supercomputer's speed of 10 to the power of 17 calculations per second, comparing a mere 20 billion pieces of data was hardly worth mentioning.
The progress bar filled almost in the blink of an eye, and the result of the comparison came out.
Cheng Xiang stared at the screen, where a DNA base comparison view had been drawn. The parts where the ten views matched were shown in white; any differences were marked in yellow. Clicking on a mark showed what the gene actually looked like at that position in each numbered view.
However, owing to display and memory limits, the monitor could show the results of only one interval at a time.
The full range was divided into more than 100,000 segments.
Cheng Xiang went straight to the statistical results beside the view.
Seeing them, he immediately put down the wolfberry tea he had raised to his mouth, and his expression became especially solemn.
Rough comparison: overlap rate 45%.
With missing fragments removed: overlap rate 60%.
With lost fragments removed: overlap rate 99.999%!
"The overlap rate is 99.999%!"
Cheng Xiang took a deep breath, picked up the internal phone beside him, and dialed Hou Zhijie's office. The phone rang twice before it was picked up.
"Director Hou, come over. I think I've found something."
Hou Zhijie, on the other end of the line, did not hesitate. She had only just distributed the newly received data to the various groups, and before she could do anything else, Cheng Xiang had already made a breakthrough. Had it been anyone else, she would have doubted their professionalism. But this was Cheng Xiang, and she had long been used to his quickness. She hung up without a word and rushed over at once.
"Director Cheng." Hou Zhijie arrived.
"Director Hou, look at the statistics!" Cheng Xiang stepped aside to let Hou Zhijie see the comparison results. Hou Zhijie leaned in at once. The statistics were very clear, and the five-nines overlap rate was especially conspicuous.
"Five nines? Once we rule out the errors from the signal being disturbed in transit over such a distance?" Hou Zhijie was surprised. She had expected an arduous exploratory task; who would have thought there would be a substantial breakthrough so quickly?
"That's right. Under the current conditions, we can make a preliminary assumption: the DNA sequences in these forty-nine boxes are completely identical." Cheng Xiang nodded.
"In other words, these forty-nine boxes serve as a backup and fault-tolerance measure?" Hou Zhijie immediately thought of a possibility.
"That should be it." Cheng Xiang affirmed Hou Zhijie's statement.
"But why would anyone need to do this?" Hou Zhijie asked. A conclusion that ends in a question always brings more questions. That is the complexity of research, and also the joy of it.
"If ten kilograms of DNA is spread across forty-nine boxes, call it fifty for the estimate, then each box holds 200 grams of DNA. What exactly would these two hundred grams of genes be?" With the backup conjecture settled, the scope of the research narrowed and the process sped up. To truly assess the value, however, they still had to determine what was stored in the genes.
"The DNA of a single organism cannot be that large. The DNA in a single human cell weighs only 3 picograms. My guess now is that these genes may not be the genes of some organism at all, but rather a storage medium that uses the properties of DNA sequences to store data."
"Biologists have long since succeeded in the laboratory at storing binary data in base-pair sequences. So in theory, this is entirely feasible."
"If I really had to make a guess, I am now inclined to think that these genes are a biological storage medium."