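How to Get the Encoding of Chinese File Names in Python 2

Introduction

This article shows how to find out which encoding Python 2 uses for file names that contain Chinese characters, and how to decode those names so that they print correctly.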

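Problem:

In Python 2, file names that contain Chinese characters come out garbled if they are not decoded first.

Assume the folder under test is named 测试 and contains five files whose names include Chinese characters: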

Python性能分析与优化.pdf

Python数据分析与挖掘实战.pdf

Python编程实战:运用设计模式,并发和程序库创建高质量程序.pdf

流畅的Python.pdf

编写高质量Python代码的59个有效方法.pdf

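First, print the file names exactly as they are returned, without decoding:

    # -*- coding: utf-8 -*-
    import os

    # os.listdir returns byte strings here; print them as they are
    for file in os.listdir('./测试'):
        print(file)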

Garbled output:

    Python ? ? ? ? ? ? ? ? ? ? ? ? .pdf
    Python ? ? ? ? ? ? ? ? ? ? ? ? ? ? .pdf
    Python ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? .pdf
    ? ? ? ? ? ? Python.pdf
    Pythonд? ? ? ? ? ? ? ? ? ? ? 59 ? ? ? ?Ч? ? ? ?.pdf
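The effect can be reproduced without a real folder. The following is a minimal sketch, assuming the names are stored as GB2312 bytes and that whatever displays them interprets the bytes as UTF-8; both are assumptions made for illustration, and the variable names are hypothetical.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function

name = u'流畅的Python.pdf'
# Assumption: the file system hands the name back as GB2312 bytes.
raw = name.encode('gb2312')

# Reading the GB2312 bytes as UTF-8 loses the Chinese characters;
# undecodable bytes are replaced with U+FFFD.
wrong = raw.decode('utf-8', 'replace')

# Decoding with the codec that produced the bytes recovers the name.
right = raw.decode('gb2312')

print(repr(raw))
print(wrong == name)
print(right == name)
```

Only the ASCII part of the name survives the wrong decoding, which matches the pattern in the garbled output above.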

Solution:

First find out how the file names are encoded. The chardet module can do this; install it with:

pip install chardet

Run chardet.detect on each file name to detect its encoding:

    {'confidence': 0.99, 'encoding': 'GB2312'}
    {'confidence': 0.99, 'encoding': 'GB2312'}
    {'confidence': 0.99, 'encoding': 'GB2312'}
    {'confidence': 0.73, 'encoding': 'windows-1252'}
    {'confidence': 0.99, 'encoding': 'GB2312'}
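The detection call itself might look like the sketch below. It uses sample names encoded as GB2312 to stand in for what os.listdir returns, so it runs without the 测试 folder; `detect_name_encodings` is a hypothetical helper, and the exact confidence values depend on the installed chardet version.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function

try:
    import chardet  # third-party: pip install chardet
except ImportError:
    chardet = None  # the sketch then returns None instead of a guess


def detect_name_encodings(raw_names):
    """Return chardet's guess for each byte-string name (hypothetical helper)."""
    if chardet is None:
        return [None] * len(raw_names)
    return [chardet.detect(raw) for raw in raw_names]


# Assumption: these bytes mimic names as stored on a GB2312 system.
sample_names = [
    u'流畅的Python.pdf'.encode('gb2312'),
    u'Python性能分析与优化.pdf'.encode('gb2312'),
]

for raw, guess in zip(sample_names, detect_name_encodings(sample_names)):
    print(repr(raw), guess)
```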

GB2312 has the highest confidence, so decode the file names with GB2312:

    # -*- coding: utf-8 -*-
    import os
    import chardet

    for file in os.listdir('./测试'):
        # turn the GB2312 byte string into a unicode string
        r = file.decode('GB2312')
        print(r)

Output:

Python性能分析与优化.pdf

Python数据分析与挖掘实战.pdf

Python编程实战:运用设计模式,并发和程序库创建高质量程序.pdf

流畅的Python.pdf

编写高质量Python代码的59个有效方法.pdf

After decoding, the file names print correctly.

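PS: chardet.detect is more accurate the longer the input string is, and less accurate the shorter it is. This is why the shortest name, 流畅的Python.pdf, was reported as windows-1252 with a confidence of only 0.73.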
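One way to work around short names is to give chardet more bytes at once. The sketch below joins all names and runs a single detection, on the assumption that every name in the folder uses the same encoding; `detect_common_encoding` is a hypothetical helper, not part of chardet.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function

try:
    import chardet  # third-party: pip install chardet
except ImportError:
    chardet = None  # the sketch then returns None instead of a guess


def detect_common_encoding(raw_names):
    """Run chardet once over all names joined together (hypothetical helper)."""
    if chardet is None or not raw_names:
        return None
    # A longer input gives the detector more evidence than any single name.
    return chardet.detect(b'\n'.join(raw_names))['encoding']


# Assumption: these bytes mimic names as stored on a GB2312 system.
raw_names = [n.encode('gb2312') for n in (
    u'流畅的Python.pdf',
    u'Python性能分析与优化.pdf',
    u'Python数据分析与挖掘实战.pdf',
)]

print(detect_common_encoding(raw_names))
```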

One more issue: the code above was tested on Windows, while on Linux file names are encoded as UTF-8. To work on both Windows and Linux the code needs a small change, so we wrap it in a function:

    # -*- coding: utf-8 -*-
    import os


    def get_filename_from_dir(dir_path):
        file_list = []
        if not os.path.exists(dir_path):
            return file_list

        for item in os.listdir(dir_path):
            basename = os.path.basename(item)
            # print(chardet.detect(basename))  # find the encoding of a file name that contains Chinese
            # File names are encoded as GB2312 on Windows and as UTF-8 on Linux
            try:
                decode_str = basename.decode("GB2312")
            except UnicodeDecodeError:
                decode_str = basename.decode("utf-8")

            file_list.append(decode_str)

        return file_list


    # Test code
    r = get_filename_from_dir('./测试')
    for i in r:
        print(i)
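The function above tries GB2312 first and falls back to UTF-8. A possible variation, sketched here as an alternative rather than as the author's method, tries UTF-8 first: UTF-8 rejects many byte sequences outright, so a GB2312 name is unlikely to pass as UTF-8 by accident, whereas some UTF-8 byte sequences can also be valid GB2312. `decode_filename` and its `encodings` parameter are hypothetical names.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function


def decode_filename(raw, encodings=('utf-8', 'gb2312')):
    """Decode a byte-string file name by trying each codec in order."""
    for enc in encodings:
        try:
            return raw.decode(enc)
        except UnicodeDecodeError:
            continue
    # Last resort: keep what decodes and mark the rest with U+FFFD.
    return raw.decode(encodings[0], 'replace')


name = u'流畅的Python.pdf'
# The same name decodes correctly whichever of the two encodings stored it.
print(decode_filename(name.encode('utf-8')) == name)
print(decode_filename(name.encode('gb2312')) == name)
```

In get_filename_from_dir, such a helper would take the place of the try/except block inside the loop.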