{"id":123,"date":"2025-01-31T17:26:00","date_gmt":"2025-01-31T09:26:00","guid":{"rendered":"https:\/\/blog.liu-qi.cn\/?p=123"},"modified":"2026-04-18T21:38:02","modified_gmt":"2026-04-18T13:38:02","slug":"%e6%88%91%e8%ae%a9ai%e5%bc%80%e5%8f%91%e4%ba%86%e4%b8%80%e4%b8%aa%e8%87%aa%e5%8a%a8%e5%88%86%e6%9e%90b%e7%ab%99%e5%bc%b9%e5%b9%95%e7%9a%84%e7%bd%91%e7%ab%99","status":"publish","type":"post","link":"https:\/\/en.blog.liu-qi.cn\/2025\/01\/31\/%e6%88%91%e8%ae%a9ai%e5%bc%80%e5%8f%91%e4%ba%86%e4%b8%80%e4%b8%aa%e8%87%aa%e5%8a%a8%e5%88%86%e6%9e%90b%e7%ab%99%e5%bc%b9%e5%b9%95%e7%9a%84%e7%bd%91%e7%ab%99\/","title":{"rendered":"I Used AI to Build an Automated Bilibili Bullet Comment Analysis Website"},"content":{"rendered":"<p>Actually, in the previous tweet<a href=\"https:\/\/blog.liu-qi.cn\/2025\/01\/27\/deepseek%E7%9A%84api%EF%BC%8C%E6%88%91%E4%BB%AC%E6%99%AE%E9%80%9A%E4%BA%BA%E9%83%BD%E8%83%BD%E7%94%A8%E5%9C%A8%E5%93%AA%EF%BC%9F\/\">Where can ordinary people like us use DeepSeek&#8217;s API?<\/a>It was mentioned before, but this time I&#8217;ll specifically bring it up, including development insights and correct usage methods.<\/p>\n<p>First, let me put the website here:<\/p>\n<p>https:\/\/danmu.liu-qi.cn\/<\/p>\n<p>Additionally, recently DeepSeek&#8217;s API has crashed badly, and even the open platform is inaccessible. So temporarily, I switched the AI analysis API to the newly released Doubao-1.5-pro-32k from Doubao, and the results are also good.<\/p>\n<p>Also, the AI deep analysis section requires uploading video subtitles, but probably many people don&#8217;t know how to obtain the subtitle file and thus fail to use it. 
We&#8217;ll explain this method later, including a complete tutorial on how to properly use the website.<\/p>\n<p>Then let&#8217;s continue.<\/p>\n<p>This little website was actually my first attempt at AI programming; I hadn&#8217;t used Cursor at the time and completed everything entirely within the chat window.<\/p>\n<p>Because I was just looking into Bilibili-related businesses then, and I knew that Bilibili&#8217;s danmaku (bullet comments) are easy to access with existing APIs. There were even local video players that could sync online danmaku while watching locally. So, I decided to create a danmaku analysis tool. Initially, I didn&#8217;t even have a clear idea, and just told the AI: I want to write a web program to fetch Bilibili video danmaku for analysis.<\/p>\n<p><img decoding=\"async\" alt=\"\" loading=\"lazy\" src=\"https:\/\/blog.liu-qi.cn\/wp-content\/uploads\/2026\/04\/001-74234ef8cbdb.png\" \/><\/p>\n<p>I even didn&#8217;t think through what could be analyzed, and I asked the AI about that too.<\/p>\n<p><img decoding=\"async\" alt=\"\" loading=\"lazy\" src=\"https:\/\/blog.liu-qi.cn\/wp-content\/uploads\/2026\/04\/002-22700a5596e1.png\" \/><\/p>\n<p>The AI itself called the API to fetch Bilibili danmaku. I provided the iFlytek Spark API documentation and my own key to the AI. And I had the AI teach me how to deploy it on my own cloud server using the BT Panel.<\/p>\n<p>So, the initial version looked like this:<\/p>\n<p><img decoding=\"async\" alt=\"\" loading=\"lazy\" src=\"https:\/\/blog.liu-qi.cn\/wp-content\/uploads\/2026\/04\/003-c4549d383db8.jpg\" \/><\/p>\n<p><img decoding=\"async\" alt=\"\" loading=\"lazy\" src=\"https:\/\/blog.liu-qi.cn\/wp-content\/uploads\/2026\/04\/004-1e3c5a30791b.jpg\" \/><\/p>\n<p><img decoding=\"async\" alt=\"\" loading=\"lazy\" src=\"https:\/\/blog.liu-qi.cn\/wp-content\/uploads\/2026\/04\/005-db3c6646e1e7.jpg\" \/><\/p>\n<p>Most of the analysis was done based on the information obtained, such as danmaku color. 
Then, following its own logic, the AI displayed samples of high-energy moment danmaku and performed sentiment analysis on 10 typical danmaku.<\/p>\n<p>This was the first version; you can see many illogical issues, but it could run.<\/p>\n<p>After that, we could continue iterating. For example, below is one of the intermediate versions where I further adjusted the layout. I also felt that sampling only 10 danmaku for sentiment analysis was unreasonable, so I added three analysis modes: Analyze All, Analyze Peaks, and Analyze 50% Sample.<\/p>\n<p><img decoding=\"async\" alt=\"\" loading=\"lazy\" src=\"https:\/\/blog.liu-qi.cn\/wp-content\/uploads\/2026\/04\/006-8dca6fcb12e1.jpg\" \/><\/p>\n<p><img decoding=\"async\" alt=\"\" loading=\"lazy\" src=\"https:\/\/blog.liu-qi.cn\/wp-content\/uploads\/2026\/04\/007-cf498729a5e9.jpg\" \/><\/p>\n<p>At least it became much more reliable and well-structured than the initial version.<\/p>\n<p>So you see, sometimes chicken soup sayings aren&#8217;t wrong. Often, you just need to start doing it first, and once it&#8217;s done, you can gradually optimize.<\/p>\n<p>After N versions, it iterated to its current state.<\/p>\n<p><img decoding=\"async\" alt=\"\" loading=\"lazy\" src=\"https:\/\/blog.liu-qi.cn\/wp-content\/uploads\/2026\/04\/008-9f501f0fc1de.jpg\" \/><\/p>\n<p><img decoding=\"async\" alt=\"\" loading=\"lazy\" src=\"https:\/\/blog.liu-qi.cn\/wp-content\/uploads\/2026\/04\/009-f0fefc142914.jpg\" \/><\/p>\n<p>\u2b06\ufe0fThe target video analyzed above features a well-known food writer uploader (content creator) using common household tools like a flame thrower to teach everyone how to cook homestyle dishes XD<\/p>\n<p>At this point, the API was switched to DeepSeek. Back then, DeepSeek hadn&#8217;t been in conflict with the U.S. military, and its API status was completely different from now\u2014it was basically a speed version, fast and effective. 
(Currently, it has been temporarily switched to Doubao.)<\/p>\n<p>Here&#8217;s a full introduction to its features:<\/p>\n<p>First, the initial interface supports direct analysis using BV numbers, video links, and video links with parameters.<\/p>\n<p><img decoding=\"async\" alt=\"\" loading=\"lazy\" src=\"https:\/\/blog.liu-qi.cn\/wp-content\/uploads\/2026\/04\/010-9f4b0c87e664.png\" \/><\/p>\n<p>This step actually involves multiple conversions. Fetching XML danmaku (bullet comments) actually requires the video&#8217;s CID, so a conversion from BV number to CID is needed. Meanwhile, what we usually copy is the video link, so the ability to obtain the video&#8217;s BV number from the video link is also required.<\/p>\n<p>This process originally required some basic front-end programming skills to operate. But now, you just need to tell the AI that you want to fetch danmaku using the video link or BV number. If the AI you&#8217;re using isn&#8217;t that intelligent, provide it with a prompt indicating the need for link conversion, and send it examples of the link and BV number format, and it can generally complete the task.<\/p>\n<p>After entering the link and clicking &#8216;Analyze Danmaku&#8217;:<\/p>\n<p>The server will automatically fetch the XML danmaku, convert it into both CSV and TXT formats, and offer download options.<\/p>\n<p><img decoding=\"async\" alt=\"\" loading=\"lazy\" src=\"https:\/\/blog.liu-qi.cn\/wp-content\/uploads\/2026\/04\/011-e1dac17c3dc6.png\" \/><\/p>\n<p>The CSV format retains more information, including danmaku color, while the TXT format only preserves the timecode and danmaku content, making it convenient for analysis beyond the program&#8217;s fixed workflow.<\/p>\n<p>Note that due to issues with the danmaku API interface, only danmaku from T-1 (the previous day) can be fetched. 
Additionally, for videos with a very large volume of danmaku, the platform itself may purge some of them, which can make the danmaku count and the sending date distribution shown below inaccurate.<\/p>\n<p>At the same time, the basic information of the video is retrieved automatically.<\/p>\n<p>The 2&#215;2 chart module below always contains four charts: density distribution, word cloud, sending date distribution, and sending time distribution.<\/p>\n<p><img decoding=\"async\" alt=\"\" loading=\"lazy\" src=\"https:\/\/blog.liu-qi.cn\/wp-content\/uploads\/2026\/04\/012-45789d27ca71.png\" \/><\/p>\n<p>Danmaku density reflects the peak interaction points in the video.<\/p>\n<p>The danmaku word cloud reflects keyword frequency. Hovering the mouse over a keyword displays how many times that word occurs.<\/p>\n<p>The sending date distribution indirectly reflects changes in video popularity and long-tail traffic. The x-axis granularity is adjusted dynamically based on the video&#8217;s publication time: videos published within the last month are shown by day, while those published more than a month ago are aggregated by week. From this chart you can spot the points in time at which an initially overlooked video suddenly became popular, and you can also use it to track long-tail traffic for uploaders doing commercial collaborations (provided, of course, that the video isn&#8217;t too popular; otherwise it may run into the inaccuracy mentioned earlier).<\/p>\n<p>The sending time distribution shows the time of day (0:00-24:00) at which the danmaku were sent, which to some extent reflects the active hours of viewers in this field or for this type of video. However, for popular uploaders, it can be skewed by the video&#8217;s publication time. 
Based on my observations, for uploaders with a certain level of influence, danmaku sending times tend to cluster within a few hours after the video is published. If you need an objective analysis, I recommend downloading the danmaku and manually removing those sent on the day the video was published.<\/p>\n<p>Further down is the AI danmaku analysis module, which calls the AI API.<\/p>\n<p><img decoding=\"async\" alt=\"\" loading=\"lazy\" src=\"https:\/\/blog.liu-qi.cn\/wp-content\/uploads\/2026\/04\/013-5f7c5f5a2950.png\" \/><\/p>\n<p>The AI danmaku analysis infers the video content from the basic video information and the danmaku content, then summarizes it. This is quite amazing. Note that at this point the AI has no information about the video content itself. Yet, relying solely on the basic video information and the viewers&#8217; danmaku, it can often deduce the video&#8217;s main theme with 80-90% accuracy and even precisely pick out hidden sponsors within the video.<\/p>\n<p>Even when challenged with videos like BV1jcqEY1EeZ, it doesn&#8217;t fail.<\/p>\n<p><img decoding=\"async\" alt=\"\" loading=\"lazy\" src=\"https:\/\/blog.liu-qi.cn\/wp-content\/uploads\/2026\/04\/014-81a5e2bf5b7b.png\" \/><\/p>\n<p>\u2b06\ufe0f This is a typical example of the kind of video mentioned earlier that suddenly becomes popular some time after publication.<\/p>\n<p>The analysis conclusion is as follows:<\/p>\n<p>The video content may be filled with a large amount of incomprehensible garbled text, strange expressions, and canned laughter.<\/p>\n<p>Indeed, that&#8217;s the case.<\/p>\n<p><img decoding=\"async\" alt=\"\" loading=\"lazy\" src=\"https:\/\/blog.liu-qi.cn\/wp-content\/uploads\/2026\/04\/015-1df68d6841a3.png\" \/><\/p>\n<p>In this section, I have set a limit of 3000 danmaku for analysis. If the danmaku count is fewer than 3000, all danmaku will be analyzed. 
If the count exceeds 3000, a proportional sample from the beginning and peak danmaku will be taken, and the remaining danmaku will be evenly sampled to make up the 3000.<\/p>\n<p>(If the API costs are too high in the future, the upper limit might be reduced to 1000, XD.)<\/p>\n<p>Next, the in-depth analysis section, which is used to analyze the video rhythm and perform cross-analysis between subtitles and danmaku.<\/p>\n<p>Friends who often use Bilibili may have noticed that almost all new videos published on Bilibili now come with AI-generated subtitles. This is the foundation of this feature.<\/p>\n<p>Based on the information I found, there is no API for subtitle files; they can only be obtained through packet capture, which involves anti-scraping measures and is difficult to implement. However, through a browser extension, it is actually quite simple to obtain video subtitles. So, I just created a manual upload function.<\/p>\n<p>To obtain video subtitles, you can search for the &#8216;Bibi Jun&#8217; browser extension in the Chrome Web Store.<\/p>\n<p><img decoding=\"async\" alt=\"\" loading=\"lazy\" src=\"https:\/\/blog.liu-qi.cn\/wp-content\/uploads\/2026\/04\/016-88cc98c15146.png\" \/><\/p>\n<p>After installing this extension, a subtitle list module will appear above the danmaku list on the Bilibili video page (it can also be displayed in a sidebar). 
It will automatically retrieve the video subtitles, and you can manually copy or download the video subtitles.<\/p>\n<p><img decoding=\"async\" alt=\"\" loading=\"lazy\" src=\"https:\/\/blog.liu-qi.cn\/wp-content\/uploads\/2026\/04\/017-48554bf4534d.png\" \/><\/p>\n<p>By the way, this plugin also comes with an AI summarization feature, and you can also use DeepSeek&#8217;s API here.<\/p>\n<p><img decoding=\"async\" alt=\"\" loading=\"lazy\" src=\"https:\/\/blog.liu-qi.cn\/wp-content\/uploads\/2026\/04\/018-d174440e1843.png\" \/><\/p>\n<p>Now, if we want to perform a cross-analysis of a video&#8217;s subtitles and bullet comments, it&#8217;s best for the subtitles to include timecodes. So, choose the &#8220;List (with time)&#8221; format for download, which will give you a TXT subtitle file with timecodes.<\/p>\n<p><img decoding=\"async\" alt=\"\" loading=\"lazy\" src=\"https:\/\/blog.liu-qi.cn\/wp-content\/uploads\/2026\/04\/019-473b1c7524ce.png\" \/><\/p>\n<p>Then, you can upload this TXT file for in-depth AI analysis.<\/p>\n<p><img decoding=\"async\" alt=\"\" loading=\"lazy\" src=\"https:\/\/blog.liu-qi.cn\/wp-content\/uploads\/2026\/04\/020-21419acfdced.png\" \/><\/p>\n<p>When the AI obtains the video&#8217;s subtitles, it largely understands the video&#8217;s content.<\/p>\n<p>Thus, it can output the pacing of the entire storyline in the video:<\/p>\n<p><img decoding=\"async\" alt=\"\" loading=\"lazy\" src=\"https:\/\/blog.liu-qi.cn\/wp-content\/uploads\/2026\/04\/021-103a7fa8c5a1.png\" \/><\/p>\n<p>Since my own work is related to marketing, I apologize for repeatedly highlighting the video&#8217;s sponsor and their product placement. I also analyzed the creative highlights of the video.<\/p>\n<p>Next comes the cross-analysis of subtitles and bullet comments. 
The AI will compare the bullet comment content during peak moments with the storylines reflected in the subtitles, offering insights into audience emotional reactions and points of interest, while also analyzing the video&#8217;s marketing strategy and communication value.<\/p>\n<p><img decoding=\"async\" alt=\"\" loading=\"lazy\" src=\"https:\/\/blog.liu-qi.cn\/wp-content\/uploads\/2026\/04\/022-8d414da184d9.png\" \/><\/p>\n<p>However, cross-analysis is quite demanding on the capabilities of large language models. Doubao, the model currently in use, does not perform impressively here; Claude would likely yield much better results.<\/p>\n<p>That&#8217;s all the features and usage instructions for this website.<\/p>\n<p>I built this entire website in a chatbot window, so after actually using Cursor, I can only say I regret not trying it sooner.<\/p>\n<p>I sincerely suggest that friends in a similar situation prioritize tools like Cursor, Windsurf, or Cline when starting to use AI for coding. They will save you a lot of effort.<\/p>\n<p>However, if you insist on doing this in a chatbot, whether because you&#8217;re not comfortable with an IDE or simply lack the budget for a subscription or API, and you&#8217;re determined to do it the way I did with this bullet comment analysis website, copying and pasting back and forth with the AI, then I have some experience to share:<\/p>\n<p>1. AI-generated code isn&#8217;t always correct, and you will inevitably go through numerous errors and fixes. Besides copying and pasting error messages, add your own description of the error, ideally including your guess at its cause. This can save you many rounds of conversation. Also, if you can&#8217;t explain it clearly, sending screenshots is effective (if you&#8217;re interacting with a multimodal large model).<\/p>\n<p>2. 
If you find the conversation has gone off track, become increasingly messy, or reached the point where you and the AI are talking past each other, don&#8217;t keep trying to correct it with more messages. Roll back to the last correct checkpoint, edit the message that followed it, and restart the conversation from there.<\/p>\n<p>3. If you&#8217;re like me and have no idea which function or route to modify when the AI suggests changes, agree in advance on a method for indicating modifications.<\/p>\n<p><img decoding=\"async\" alt=\"\" loading=\"lazy\" src=\"https:\/\/blog.liu-qi.cn\/wp-content\/uploads\/2026\/04\/023-0531739dfcda.jpg\" \/><\/p>\n<p>4. For files with long code, have the AI add comments like Part1, Part2 within them. When the volume of modifications is large, ask it to send you the complete code for one specific part (sending everything at once may get cut off and eats into the conversation&#8217;s token limit), and then copy and paste the entire block.<\/p>\n<p>If you inspect the HTML code of the Bilibili Danmaku Analyzer webpage, you&#8217;ll notice comments like &#8216;Part 1&#8217; and &#8216;Part 2&#8217; inside, which were used for exactly this purpose.<\/p>\n<p><img decoding=\"async\" alt=\"\" loading=\"lazy\" src=\"https:\/\/blog.liu-qi.cn\/wp-content\/uploads\/2026\/04\/024-72341f72800e.png\" \/><\/p>\n<p>It&#8217;s the same in the Python code:<\/p>\n<p><img decoding=\"async\" alt=\"\" loading=\"lazy\" src=\"https:\/\/blog.liu-qi.cn\/wp-content\/uploads\/2026\/04\/025-917e4527049a.png\" \/><\/p>\n<p>5. If you&#8217;re using Claude and want to copy and paste a long block of code in one go, you can have it output to an Artifact. Long code snippets get cut off in normal conversations, but in an Artifact, the next reply can continue the code output from where the previous one stopped. However, when making modifications, I still recommend dividing the code into Part 1 and Part 2.<\/p>\n<p>6. 
Before starting to output code, it&#8217;s best to chat with the AI for a few rounds to help it design the architecture clearly. For example, the API interface for large models is best placed in a separate file for reference or using environment variables. But if you directly ask the AI to write it right away, it might end up hardcoded somewhere or even exposed in the frontend JavaScript. In that case, if one day you want to switch from DeepSeek&#8217;s API to Doubao, making changes would be very troublesome. With Cursor, it&#8217;s still manageable, but using ChatBot is particularly cumbersome.<\/p>\n<p>In short, it&#8217;s still recommended to use automation tools whenever possible instead of copying and pasting.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>This article details the process of using AI to develop a website that automatically analyzes Bilibili bullet comments, including development insights and proper usage instructions.<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[24],"tags":[],"class_list":["post-123","post","type-post","status-publish","format-standard","hentry","category-articles"],"_links":{"self":[{"href":"https:\/\/en.blog.liu-qi.cn\/index.php\/wp-json\/wp\/v2\/posts\/123","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/en.blog.liu-qi.cn\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/en.blog.liu-qi.cn\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/en.blog.liu-qi.cn\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/en.blog.liu-qi.cn\/index.php\/wp-json\/wp\/v2\/comments?post=123"}],"version-history":[{"count":0,"href":"https:\/\/en.blog.liu-qi.cn\/index.php\/wp-json\/wp\/v2\/posts\/123\/revisions"}],"wp:attachment":[{"href":"https:\/\/en.blog.liu-qi.cn\/index.php\/wp-json\/
wp\/v2\/media?parent=123"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/en.blog.liu-qi.cn\/index.php\/wp-json\/wp\/v2\/categories?post=123"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/en.blog.liu-qi.cn\/index.php\/wp-json\/wp\/v2\/tags?post=123"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}