Behind the 4 Hours of the Spring Festival Gala, Jingdong Cloud Staged 16 Rounds of “Moving Art”
Head image source: Sohu News – Pole body

Nineteen days, from 0 to 1; four hours, 16 extreme micro-maneuvers. Behind the 69.1 billion red envelope interactions of the 2022 Year of the Tiger Spring Festival Gala, Jingdong Cloud amazed the industry with its strength. It was the first provider to support the Gala's red envelope interaction independently with zero new servers. Relying on Cloud Ship, Jingdong Cloud moved its existing computing resources efficiently and precisely, and successfully handled the world's largest traffic peak during the Spring Festival Gala. This is a first in the history of cloud computing: it sets a new record for cloud-native practice at super scale and a “Chinese speed” for new digital infrastructure.

On January 5, JD.com was selected as the red envelope interaction partner of the 2022 CCTV Spring Festival Gala, but it faced the most difficult red envelope campaign ever. Previous partners reserved enough servers to withstand the flood of traffic. The preparation period for the Year of the Tiger Gala was only 19 days, the shortest of any CCTV Spring Festival Gala red envelope interaction, which meant that responding by adding servers was no longer feasible.

Moreover, the interaction cycle of this Gala, which started on January 24 and lasted until February 15, was the longest in history. In addition to the New Year's Eve peak, Jingdong Cloud had to withstand 23 days of continuous pulse-type peaks. The extra-long cycle was a great test of data center resources, system architecture stability, business system deployment capacity and more.

In terms of traffic volume, taking 2021 as an example, the total number of red envelope interactions in that year's Gala exceeded 70 billion, and traffic was expected to increase further in the 2022 Gala's interactive activities. Within a few seconds, tens of billions of user clicks on the interaction create a QPS traffic peak in the hundreds of millions, sharply increasing the data-processing pressure on servers.

User access behavior also differs from that of the June 18 and November 11 shopping festivals. Seven rounds of on-air announcements by the hosts during the Gala bring seven rounds of peak access, posing great challenges to system stability and sustained resource supply.

Unlike previous years, this year's Spring Festival Gala was more of a “pulse-type gala”. Although the overall link is not complicated, Jingdong still kept its “Spring Festival delivery” service running, which meant it could not scale back its daily mall business, and at the same time it had to guarantee resources and logistics for the Winter Olympics. Coordinating every link, from the front-end website, orders, settlement, payment, search and recommendation to back-end warehousing, delivery, customer service and after-sales, placed a higher demand on the scheduling and allocation of underlying resources.

Anxiety was everywhere, and the challenges were unprecedented.
Facing a Year of the Tiger Spring Festival Gala it “knew nothing” about, how would Jingdong Cloud cope?

Jingdong Cloud's magic weapon is Cloud Ship (JDOS), its cloud-native infrastructure and hybrid cloud operating system. The unique advantage of Cloud Ship is that it shields differences in the underlying infrastructure as far as possible and can “move” all resources. All underlying resources, underlying service containers and business services are deployed on Cloud Ship, which provides unified resource scheduling and guarantees.

On Cloud Ship, the Archimedes intelligent scheduling system combines monitoring data with application capacity and profile information to schedule and allocate resources in a targeted way. This allows scheduling to switch within seconds according to the scenario and, ultimately, lets different businesses use the same resources at staggered times.

Zhao Jianxing, head of container R&D in Jingdong Cloud's cloud-native product R&D department, said: “With resources in short supply overall, we adjusted and divided all online businesses into levels, and we schedule them differently according to business priority and business scenario, to ensure that resources are fully used.”

Take the Gala's on-air announcements as an example. They have the highest priority in the whole system, so their resources are expanded quickly at the priority scheduling level. Lower-priority services are throttled within seconds, freeing more resources to expand the capacity guaranteed to key businesses. After each on-air announcement, the system switches back and forth between “red envelope mode” and “daily mode”.
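The priority-based shifting described above can be illustrated with a small sketch. This is not Cloud Ship's or Archimedes' actual algorithm, which the article does not disclose. The `Service` fields, the `allocate` function, the service names and all the numbers are hypothetical, chosen only to show how a fixed pool of containers might be redistributed when the mode changes.

```python
from dataclasses import dataclass


@dataclass
class Service:
    name: str
    priority: int  # assumption: lower number means higher priority
    demand: int    # containers the service wants in the current mode
    floor: int     # containers it keeps even while throttled


def allocate(services, capacity):
    """Give every service its floor, then hand out the rest by priority."""
    alloc = {s.name: s.floor for s in services}
    remaining = capacity - sum(alloc.values())
    if remaining < 0:
        raise ValueError("capacity cannot cover the guaranteed floors")
    for s in sorted(services, key=lambda s: s.priority):
        # Never exceed the service's demand or the remaining pool.
        extra = max(0, min(s.demand - s.floor, remaining))
        alloc[s.name] += extra
        remaining -= extra
    return alloc


CAPACITY = 100  # fixed pool: no new servers are added (hypothetical size)

# Hypothetical demands in the two modes; only the red envelope demand changes.
daily_mode = [
    Service("red_envelope", priority=0, demand=10, floor=5),
    Service("checkout", priority=1, demand=60, floor=30),
    Service("recommendation", priority=2, demand=40, floor=10),
]
red_envelope_mode = [
    Service("red_envelope", priority=0, demand=50, floor=5),
    Service("checkout", priority=1, demand=60, floor=30),
    Service("recommendation", priority=2, demand=40, floor=10),
]

daily_alloc = allocate(daily_mode, CAPACITY)
red_alloc = allocate(red_envelope_mode, CAPACITY)
print(daily_alloc)
print(red_alloc)
```

In this toy setup, switching to red envelope mode grows the highest-priority service and squeezes the lower-priority ones toward their floors, while the total stays at the fixed capacity. Switching back reverses the move.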
In Spring Festival Gala mode, following the hosts' 7 calls to grab red envelopes, the system carried out 16 precise resource shifts between Jingdong's transaction scenario and the Gala's red envelope interaction scenario. In addition, for some of the highest-priority business systems, the Jingdong Cloud team also shifted resources back and forth across the whole pool. For the first time, Jingdong's retail IDC rooms and Jingdong Cloud were deployed in a mixed fashion, combining some of Jingdong Cloud's internal resources with off-cloud and on-cloud resources to fully guarantee the stability and adequacy of resources.

Making an “elephant walk a tightrope”: this time Jingdong Cloud really did it.

According to data Jingdong released for the Gala, Cloud Ship, Jingdong Cloud's hybrid multi-cloud operating system, successfully took on the world's largest traffic peak without any increase in computing resources. Based on 70 data centers across the country, it scheduled a world-leading fleet of nearly 3 million containers and more than 10 million cores of computing power within seconds, and with this super elasticity it successfully climbed the “Mount Qomolangma” of cloud computing.

[Photo caption: Jingdong Cloud technical staff on frontline duty on New Year's Eve celebrate the successful completion of the task]

‖ A ten-thousand-person campaign

Nineteen days of preparation, more than 100 virtual teams, ten thousand R&D staff working together, nearly 600 Spring Festival Gala project requirements, more than 3,000 tasks... How could such a large-scale coordinated operation be delivered and run smoothly?

In this process, “Xingyun”, Jingdong Cloud's open R&D collaboration platform, played an indispensable role. As a tool chain covering the whole life cycle of requirements, development, testing, release, and operation and maintenance, Xingyun had already played an important coordinating role in earlier promotion events and daily work, and had established tacit cooperation and coordination standards within Jingdong.

“In the Spring Festival Gala project, based on Xingyun's low-code platform, all the varied red envelope activities and marketing strategies for different scenarios are assembled like building blocks. Product managers only need to drag and drop according to the requirements to build floors and activity elements,” said He Yuzhi, the person in charge of the cloud platform. “Once the product activities are built, fast collaborative delivery is basically guaranteed.”

With the support of Xingyun, 10,000 technical staff across many locations and departments quickly aligned on goals and plans, and cleared problems and risks daily. Among other things, capacity reduction of the resource pool was tracked hour by hour in real time to ensure that more than 100,000 loads were scaled down each day.

“One minute on the stage, ten years of work off the stage.” To ensure performance and stability, the Jingdong Cloud team used Yuntai to simulate and rehearse every failure that might occur. From sudden incidents such as data center power outages and downtime to drilled failures such as server crashes and hard disk faults, the team exposed and discovered problems, took preventive measures, and gained preparation time.

At the same time, to ensure the best possible user experience, the Jingdong Cloud team conducted 7 rounds of full-link stress tests before the Gala. “Based on actual production business scenarios and system environments, we simulate massive user requests and data, test and verify various scenarios, and keep stress-testing, tuning and iterating on the system.”

In addition, Jingdong Cloud pioneered the “emergency script plan” model, which covers the front end, middle platform, back end and multiple modules, and includes core plans for events such as CDN overload and public network egress outages. The simplified version of the emergency script plan alone runs to 61 pages and more than 20,000 words.

As the senior director of Jingdong's cloud infrastructure R&D department, who was in charge of IDC basic support for the Spring Festival Gala project, put it, this is just like the “spell” with which the Hogwarts headmaster brings the fountain statues to life in the battle with Lord Voldemort in Harry Potter: because of its great power, one person can use it only once in a lifetime. For Jingdong, every extreme operation and every rare step had to be rehearsed once, so that the system would be truly foolproof under the impact of the Gala's traffic peak.

Public data show that in 2020, in the global cloud computing IaaS market, Jingdong Cloud ranked fifth in China by IaaS market share and in the top three among leading vendors by growth rate, climbing into the first echelon of domestic cloud computing.

“The Gala is the best experience,” Steady said. “Before, when our cloud technology served the company itself or other enterprises, it never faced such a large volume. This time, everything from our retail business lines to the cloud base and down to the underlying infrastructure was put to the test. It is the real test of our cloud technology of the past few years.”

With the technical strength of its new digital infrastructure, and with zero increase in computing resources, Jingdong Cloud successfully handled the Year of the Tiger Spring Festival Gala red envelope interaction: the world's largest in scale, most complex in scenario, longest in cycle and shortest in preparation time. Having climbed to the top of cloud computing's “Everest”, Jingdong Cloud is setting sail again.