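Let us name the given text file bar.txt.
We use Python's file-handling methods to remove duplicate lines from a text file. The code below is one way to remove the duplicate lines from bar.txt and store the result in foo.txt. Both files are opened by relative path, so they must be in the directory the script is run from (normally the same directory as the Python script file); otherwise the program will not find bar.txt and will fail.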
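If the script may be launched from a different directory, one option is to build the file paths from the script's own location instead of relying on the current working directory. The following is only a sketch of that idea; the helper name `beside_script` is hypothetical and is not part of the original program.

```python
from pathlib import Path


def beside_script(script_file, name):
    """Return the path of `name` in the same directory as `script_file`.

    `beside_script` is a hypothetical helper introduced for illustration.
    """
    return Path(script_file).resolve().parent / name


# Typical use inside a script, where __file__ is the script's own path:
#   infile_path = beside_script(__file__, "bar.txt")
#   outfile_path = beside_script(__file__, "foo.txt")
```

The returned paths can be passed to `open()` in place of the bare file names.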
The file bar.txt is as follows
A cow is an animal.
A cow is an animal.
A buffalo too is an animal.
Lion is the king of jungle.
The code below removes the duplicate lines in bar.txt and stores the result in foo.txt
# This program opens bar.txt, removes duplicate lines, and writes the
# remaining lines to foo.txt.
lines_seen = set()  # holds lines already seen
with open('bar.txt', "r") as infile, open('foo.txt', "w") as outfile:
    print("The file bar.txt is as follows")
    for line in infile:
        print(line, end="")  # line already ends with a newline
        if line not in lines_seen:  # not a duplicate
            outfile.write(line)
            lines_seen.add(line)
# Both files are closed here, so foo.txt is fully written before it is read.
print("The file foo.txt is as follows")
with open('foo.txt', "r") as result:
    for line in result:
        print(line, end="")
Output
The file foo.txt is as follows
A cow is an animal.
A buffalo too is an animal.
Lion is the king of jungle.
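The same idea can be wrapped in a reusable function. The sketch below is one possible variation rather than the program above: the function name `remove_duplicate_lines` and its parameters are assumptions made for illustration. It compares lines with the trailing newline stripped, so a final line that lacks a newline is still recognized as a duplicate of an earlier identical line.

```python
def remove_duplicate_lines(src_path, dst_path):
    """Copy src_path to dst_path, keeping only the first copy of each line.

    Hypothetical helper for illustration; returns the number of lines dropped.
    """
    seen = set()   # line contents already written
    dropped = 0
    with open(src_path, "r") as infile, open(dst_path, "w") as outfile:
        for line in infile:
            key = line.rstrip("\n")  # compare without the line ending
            if key in seen:
                dropped += 1
                continue
            seen.add(key)
            outfile.write(key + "\n")
    return dropped
```

Calling `remove_duplicate_lines('bar.txt', 'foo.txt')` on the sample file would drop the second "A cow is an animal." line and keep the other three lines in their original order.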