Design and Implementation of an MP3 Downloader

Word count: 13,503; pages: 36

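Abstract
The search engine, as the "network portal" to the Internet, is a shortcut to obtaining information resources from the WWW quickly and effectively. The web crawler is a key search engine technology: it is a program that automatically extracts, analyzes, and filters web pages, downloading them from the World Wide Web for the search engine, and it is an important component of the search engine. File transfer, one of the most important functions of network applications, is also the foundation of resource sharing on the Internet, and download tools have become indispensable. Important protocols such as HTTP and FTP support file transfer. In particular, P2P-based download mechanisms that offer multi-task, multi-threaded, multi-source, and resumable downloading have greatly increased the download speed of network resources and maximized their sharing.
The paper first introduces the main theory and technology involved in the project. After a detailed analysis of the principles of crawler technology and of file download mechanisms, it improves the crawler algorithm for this application. Based on the improved algorithm, an MP3 downloader is designed and implemented; it consists of two parts, a web crawler and a file downloader. The web crawler collects the URLs of MP3 music resources on the Internet together with related information (song title, artist, album, and so on) and saves this information locally in XML format as a basis for later queries and downloads. The file downloader implements HTTP-based downloading and provides resumable downloads, multi-task downloading, and automatic file renaming. The MP3 downloader is then tested, and the results show that it achieves the expected behavior both in crawling MP3 information and in downloading MP3 files.
Finally, the paper summarizes the work and discusses directions for future work.

Keywords: search engine, web crawler, HTTP, P2P, resumable download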
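
The crawler described in the abstract can be illustrated with a minimal sketch. It is written in Python for illustration only and is not the thesis's own source code. It uses a plain breadth-first traversal with regular-expression link extraction, not the improved algorithm of Section 3.2.2, which this page does not describe. The names `crawl_mp3_links`, `to_xml`, and `HREF_RE`, the `fetch` parameter, and the XML element names are all assumptions made for this example.

```python
import re
import xml.etree.ElementTree as ET
from collections import deque
from urllib.parse import urljoin

# Hypothetical pattern: captures the value of href="..." attributes.
HREF_RE = re.compile(r'href\s*=\s*["\']([^"\']+)["\']', re.IGNORECASE)


def crawl_mp3_links(start_url, fetch, max_pages=50):
    """Breadth-first crawl from start_url; return MP3 URLs in discovery order.

    `fetch` is a caller-supplied function mapping a URL to its HTML text
    (an assumption, so that the sketch can run without network access).
    """
    queue = deque([start_url])
    seen = {start_url}
    mp3_urls = []
    pages = 0
    while queue and pages < max_pages:
        page_url = queue.popleft()
        html = fetch(page_url)
        pages += 1
        for href in HREF_RE.findall(html):
            link = urljoin(page_url, href)  # resolve relative links
            if link in seen:
                continue
            seen.add(link)
            if link.lower().endswith(".mp3"):
                mp3_urls.append(link)  # a resource to record, not a page to visit
            else:
                queue.append(link)  # visit later, after the current level
    return mp3_urls


def to_xml(records):
    """Serialize records (dicts with url, title, artist, album) to an XML string."""
    root = ET.Element("mp3list")
    for rec in records:
        item = ET.SubElement(root, "mp3", url=rec["url"])
        for field in ("title", "artist", "album"):
            ET.SubElement(item, field).text = rec.get(field, "")
    return ET.tostring(root, encoding="unicode")
```

Passing `fetch` in as a parameter keeps the traversal logic separate from network access; a real crawler would supply a function that performs the HTTP request and handles errors and timeouts.
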
Design and Implementation of an MP3 Downloader
Abstract
Search engines, as the "portal" to the Internet, are a shortcut to fast and effective access to information resources on the WWW. The web crawler is a key search engine technology: it is a program that automatically extracts, analyzes, and filters web pages, downloading them from the World Wide Web on the search engine's behalf. File transfer, one of the most important functions of network applications, is also the basis of resource sharing on the Internet, and download tools have become indispensable. Important protocols such as HTTP and FTP support file transfer. In particular, download mechanisms based on P2P technology that offer multi-tasking, multi-threading, multiple sources, and resumable transfer greatly improve download speed and maximize the sharing of network resources.
This paper first introduces the main theory and technology related to the topic. After analyzing in depth the principles of the web crawler and the mechanisms of file downloading, it improves the crawler algorithm to suit this application. Based on the improved algorithm, an MP3 downloader is designed and implemented. The web crawler collects links to MP3 resources on the Internet together with related information (title, artist, album, etc.) and stores this information in a local file in XML form, providing a basis for later queries and downloads. HTTP-based downloading is implemented, with resumable transfer, multi-task downloading, and automatic renaming of downloaded files. The MP3 downloader is then tested, and the tests show that it achieves the expected results.
Finally, the paper reviews the work and gives an outlook on future work.

Key Words: search engine, web crawler, HTTP, P2P, resumable download
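
The resumable download and automatic renaming mentioned in the abstract can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the thesis's implementation. It relies on the standard HTTP `Range` request header and on the 206 (Partial Content) and 416 (Range Not Satisfiable) status codes. The function names and the `name(1).ext` renaming scheme are hypothetical.

```python
import os
import urllib.error
import urllib.request


def range_header(part_path):
    """Return (offset, headers) for resuming from an existing partial file."""
    offset = os.path.getsize(part_path) if os.path.exists(part_path) else 0
    headers = {"Range": f"bytes={offset}-"} if offset > 0 else {}
    return offset, headers


def unique_name(path):
    """Return path itself, or 'name(1).ext', 'name(2).ext', ... if it is taken.

    The numbering scheme is an assumption made for this example.
    """
    if not os.path.exists(path):
        return path
    stem, ext = os.path.splitext(path)
    n = 1
    while os.path.exists(f"{stem}({n}){ext}"):
        n += 1
    return f"{stem}({n}){ext}"


def resume_download(url, part_path, chunk_size=64 * 1024):
    """Download url into part_path, continuing a partial file when possible.

    Returns the size in bytes of the local file afterwards.
    """
    offset, headers = range_header(part_path)
    request = urllib.request.Request(url, headers=headers)
    try:
        response = urllib.request.urlopen(request)
    except urllib.error.HTTPError as err:
        if err.code == 416 and offset > 0:
            # The server has no bytes at or beyond the offset, so the
            # local file is treated as already complete.
            return offset
        raise
    with response:
        # 206: the server honoured the range, so append to the partial file.
        # Any other success status (typically 200): the full body was sent,
        # so overwrite from the start.
        mode = "ab" if response.status == 206 else "wb"
        with open(part_path, mode) as out:
            while True:
                chunk = response.read(chunk_size)
                if not chunk:
                    break
                out.write(chunk)
    return os.path.getsize(part_path)
```

A caller might pick the target with `unique_name("song.mp3")` before the first request and then call `resume_download` again with the same path after an interruption. Checking the status code matters because a server that ignores `Range` replies with the whole file, and appending that to the partial file would corrupt it.
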

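Contents
1 Introduction
1.1 Background and Purpose of the Project
1.2 Research Status and Trends at Home and Abroad
1.2.1 Search Engines
1.2.2 File Downloading
1.3 Research Content and Significance
1.4 Structure of This Paper
2 Technical Overview
2.1 Regular Expression Matching
2.2 XML
2.3 Principles of Search Engines
2.4 Threads
2.4.1 Threads
2.4.2 Multithreading
2.5 MP3 Tag Information
2.6 The HTTP Protocol
2.7 The PageRank Algorithm
2.8 Chapter Summary
3 System Design and Implementation
3.1 System Flowchart
3.2 MP3 Crawler Algorithm
3.2.1 Breadth-First Traversal Strategy
3.2.2 Crawler Algorithm Improvements for This Project
3.2.3 Parsing HTML
3.3 MP3 Tags
3.3.1 MP3 Tag Extraction
3.3.2 MP3 Tag Storage
3.4 File Downloading
3.4.1 Resumable Download
3.4.2 Batch Download
3.4.3 File Renaming
3.4.4 Calculating Download Speed, Progress, and Remaining Time
3.5 .ini Configuration File
3.6 delegate and event: Custom Events
3.7 Chapter Summary
4 Experimental Results and Analysis
4.1 Web Crawler
4.2 Query
4.3 File Downloading
4.4 Analysis of Results
4.5 Chapter Summary
5 Summary and Outlook
5.1 Summary
5.2 Outlook
Acknowledgements
References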

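This paper is filed under the Electronics and Communication papers section and was compiled by 论文格式网 (www.lwgsw.com); please credit the source when reproducing it.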
