Accurately detecting file types in Python with filetype
filetype.py
Small and dependency-free Python package to infer file type and MIME type by checking the magic number signature of a file or buffer.
This is a Python port of the filetype Go package. It is free and open, and it requires Python 3 or later.
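Features
•Simple and friendly API
•Supports a wide range of file types
•Provides file extension and MIME type inference
•File discovery by extension or MIME type
•File discovery by kind (image, video, audio…)
•Pluggable: add new custom type matchers
•Fast, even when processing large files
•Only the first 261 bytes, the maximum file header size, are required, so you can pass just a short byte buffer instead of the whole file
•Dependency free (just Python code, no C extensions, no libmagic bindings)
•Cross-platform file recognition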
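To make the magic-number idea concrete, here is a minimal sketch of header sniffing that uses only the standard library. It is not filetype's actual implementation: the `SIGNATURES` table and the `sniff` and `sniff_file` names are hypothetical, and only a handful of well-known signatures are included. It reads at most 261 bytes, the header size quoted in the feature list.

```python
import os
import tempfile

# Illustrative subset of well-known magic numbers.
# This table is an assumption made for the sketch, not filetype's own data.
SIGNATURES = [
    (b"\xff\xd8\xff", "jpg", "image/jpeg"),
    (b"\x89PNG\r\n\x1a\n", "png", "image/png"),
    (b"GIF8", "gif", "image/gif"),
    (b"%PDF", "pdf", "application/pdf"),
    (b"PK\x03\x04", "zip", "application/zip"),
    (b"\x1f\x8b\x08", "gz", "application/gzip"),
]

HEADER_SIZE = 261  # header size quoted in the feature list


def sniff(buf):
    """Return (extension, mime) for a bytes-like header, or None if unknown."""
    head = bytes(buf[:HEADER_SIZE])
    for magic, extension, mime in SIGNATURES:
        if head.startswith(magic):
            return extension, mime
    return None


def sniff_file(path):
    """Read only the first HEADER_SIZE bytes of a file and sniff them."""
    with open(path, "rb") as handle:
        return sniff(handle.read(HEADER_SIZE))


if __name__ == "__main__":
    # Build a throwaway file: a real PNG signature followed by filler bytes.
    with tempfile.NamedTemporaryFile(suffix=".bin", delete=False) as tmp:
        tmp.write(b"\x89PNG\r\n\x1a\n" + b"\x00" * 100_000)
    try:
        print(sniff_file(tmp.name))
    finally:
        os.remove(tmp.name)
```

Because only a fixed-size prefix is read, the cost of the check does not grow with the size of the file, which is the property the "fast, even with large files" point relies on.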
Installation
pip install filetype
API
For details, see the annotated API reference.
Examples
Simple file type checking
    import filetype

    def main():
        # guess() returns None when no known signature matches
        kind = filetype.guess('tests/fixtures/sample.jpg')
        if kind is None:
            print('Cannot guess file type!')
            return

        print('File extension: %s' % kind.extension)
        print('File MIME type: %s' % kind.mime)

    if __name__ == '__main__':
        main()
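Since the package description says inference works on a buffer as well as a file, the same check can be run on bytes that are already in memory. The sketch below assumes `filetype.guess` accepts a `bytes` object; `read_header` and `guess_from_header` are hypothetical helper names, and 261 is the header size quoted in the feature list.

```python
HEADER_SIZE = 261  # header size quoted in the feature list


def read_header(path, size=HEADER_SIZE):
    """Return at most `size` bytes from the start of the file at `path`."""
    with open(path, 'rb') as handle:
        return handle.read(size)


def guess_from_header(path):
    """Guess the type from the header alone.

    Returns an (extension, mime) tuple, or None when nothing matches.
    """
    # Imported here so read_header stays usable without the third-party package.
    import filetype

    kind = filetype.guess(read_header(path))  # assumes guess() accepts bytes
    if kind is None:
        return None
    return kind.extension, kind.mime
```

Calling `guess_from_header('tests/fixtures/sample.jpg')` with the fixture from the example above should report the same extension and MIME type as passing the path directly, while reading only the start of the file.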
Supported types
Image
•jpg – image/jpeg
•png – image/png
•gif – image/gif
•webp – image/webp
•cr2 – image/x-canon-cr2
•tif – image/tiff
•bmp – image/bmp
•jxr – image/vnd.ms-photo
•psd – image/vnd.adobe.photoshop
•ico – image/x-icon
Video
•mp4 – video/mp4
•m4v – video/x-m4v
•mkv – video/x-matroska
•webm – video/webm
•mov – video/quicktime
•avi – video/x-msvideo
•wmv – video/x-ms-wmv
•mpg – video/mpeg
•flv – video/x-flv
Audio
•mid – audio/midi
•mp3 – audio/mpeg
•m4a – audio/m4a
•ogg – audio/ogg
•flac – audio/x-flac
•wav – audio/x-wav
•amr – audio/amr
Archive
•epub – application/epub+zip
•zip – application/zip
•tar – application/x-tar
•rar – application/x-rar-compressed
•gz – application/gzip
•bz2 – application/x-bzip2
•7z – application/x-7z-compressed
•xz – application/x-xz
•pdf – application/pdf
•exe – application/x-msdownload
•swf – application/x-shockwave-flash
•rtf – application/rtf
•eot – application/octet-stream
•ps – application/postscript
•sqlite – application/x-sqlite3
•nes – application/x-nintendo-nes-rom
•crx – application/x-google-chrome-extension
•cab – application/vnd.ms-cab-compressed
•deb – application/x-deb
•ar – application/x-unix-archive
•Z – application/x-compress
•lz – application/x-lzip
Font
•woff – application/font-woff
•woff2 – application/font-woff
•ttf – application/font-sfnt
•otf – application/font-sfnt
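The lists above amount to a mapping from extension to MIME type, grouped by kind. As a rough illustration of lookup by extension or by MIME type, the sketch below copies a few of the listed pairs into a plain dictionary. `SUPPORTED`, `mime_for`, `extensions_for`, and `group_for` are hypothetical names for this sketch and are not part of the filetype API.

```python
# A few pairs copied from the lists above, grouped the same way.
SUPPORTED = {
    "image": {"jpg": "image/jpeg", "png": "image/png", "gif": "image/gif"},
    "video": {"mp4": "video/mp4", "mkv": "video/x-matroska"},
    "audio": {"mp3": "audio/mpeg", "flac": "audio/x-flac"},
    "archive": {"zip": "application/zip", "pdf": "application/pdf"},
    "font": {"woff": "application/font-woff", "woff2": "application/font-woff"},
}


def mime_for(extension):
    """Return the MIME type listed for an extension, or None if it is not listed."""
    for types in SUPPORTED.values():
        if extension in types:
            return types[extension]
    return None


def extensions_for(mime):
    """Return every listed extension that maps to the given MIME type, sorted."""
    return sorted(
        extension
        for types in SUPPORTED.values()
        for extension, listed_mime in types.items()
        if listed_mime == mime
    )


def group_for(extension):
    """Return the group name (image, video, ...) an extension is listed under."""
    for group, types in SUPPORTED.items():
        if extension in types:
            return group
    return None


print(mime_for("mkv"))
print(extensions_for("application/font-woff"))
print(group_for("pdf"))
```

Note that the group cannot be derived from the MIME type alone: the fonts and archives listed here all use `application/...` types, and two font extensions share one MIME type, which is why the sketch keeps the grouping explicit.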
Benchmarks
The benchmarks were measured against real files, which you can download from the link: real files.
Environment: OS X x64 i7 2.7 GHz
---------------------------------------------------------------------------------- benchmark: 7 tests ----------------------------------------------------------------------------------
Name (time in ns)                            Min                  Max                 Mean              StdDev               Median                 IQR  Outliers(*)  Rounds  Iterations
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_infer_image_from_bytes       357.6279 (1.0)    29,166.5395 (1.0)     1,642.3360 (1.0)      380.9934 (1.0)     1,509.9843 (1.0)      158.9457 (1.0)   9095;13752  102301           6
test_infer_audio_from_bytes      953.6743 (2.67)   96,082.6874 (3.29)  16,534.5880 (10.07)   3,002.1143 (7.88)  15,974.0448 (10.58)     953.6743 (6.00)    4514;6051   41528           1
test_infer_video_from_bytes  13,828.2776 (38.67)  272,989.2731 (9.36)   16,151.3144 (9.83)   3,361.2320 (8.82)   15,020.3705 (9.95)     953.6743 (6.00)    2522;2887   22193           1
test_infer_image_from_disk   15,974.0448 (44.67)  108,957.2906 (3.74)  18,621.0844 (11.34)  3,895.4441 (10.22)  17,166.1377 (11.37)   1,192.0929 (7.50)    1528;1804   10206           1
test_infer_video_from_disk   23,841.8579 (66.67)  229,120.2545 (7.86)  28,691.3476 (17.47)  6,242.9901 (16.39)  25,987.6251 (17.21)  4,053.1158 (25.50)    1987;1247   15651           1
test_infer_zip_from_disk     26,941.2994 (75.33)  230,073.9288 (7.89)  32,123.3861 (19.56)  7,524.4988 (19.75)  29,087.0667 (19.26)  4,768.3716 (30.00)    1349;1292   16132           1
test_infer_tar_from_disk     33,855.4382 (94.67)  164,031.9824 (5.62)  36,884.4401 (22.46)  4,489.4443 (11.78)  36,001.2054 (23.84)     953.6743 (6.00)    1036;1828   14666           1
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
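The parenthesized figures in the table are ratios relative to the best value in each column, so 1.0 marks the fastest row. As a quick arithmetic check, the snippet below recomputes those ratios for the Min column from the values in the table; `min_ns` and `relative` are names chosen for this sketch.

```python
# Min column (nanoseconds), copied from the benchmark table above.
min_ns = {
    "test_infer_image_from_bytes": 357.6279,
    "test_infer_audio_from_bytes": 953.6743,
    "test_infer_video_from_bytes": 13828.2776,
    "test_infer_image_from_disk": 15974.0448,
    "test_infer_video_from_disk": 23841.8579,
    "test_infer_zip_from_disk": 26941.2994,
    "test_infer_tar_from_disk": 33855.4382,
}


def relative(values):
    """Format each value as a ratio to the smallest value, to two decimals."""
    best = min(values.values())
    return {name: f"{value / best:.2f}" for name, value in values.items()}


for name, ratio in relative(min_ns).items():
    print(f"{name:<28} {ratio:>6}")
```

Read this way, the Min column says the best-case inference from a file on disk took roughly 45 to 95 times as long as the best-case inference from an in-memory image buffer on this machine.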