Wednesday, 22 April 2020

Python code to remove blank lines and duplicate whitespace

These short Python snippets clean up a text file by removing blank lines and collapsing duplicate whitespace. Each one reads a file named "filename", processes its contents, and writes the result back to the same file, overwriting the original. Method 1 handles blank lines only; Methods 2 and 3 also collapse repeated whitespace. This is handy when a text file needs tidying before further processing or analysis.

Method 1:

with open("filename", "r") as f:
    lines = f.readlines()

new_lines = []
for line in lines:
    # Skip lines that are empty or contain only whitespace.
    if line.strip() == "":
        continue
    # Keep every other line unchanged, including its trailing newline.
    new_lines.append(line)

with open("filename", "w") as f:
    f.writelines(new_lines)


Method 2:

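with open("filename", "r") as f:
    lines = f.readlines()

new_lines = []
for line in lines:
    # Skip blank lines; line.strip() is empty for whitespace-only lines.
    if line.strip():
        # split() with no arguments splits on any run of whitespace, so
        # joining with one space collapses duplicates. It also drops
        # leading and trailing whitespace, including indentation.
        new_lines.append(" ".join(line.split()))

with open("filename", "w") as f:
    # Joining with "\n" leaves no newline after the last line.
    f.write("\n".join(new_lines))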
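To see what the split-and-join step does without touching a file, here is a minimal sketch that applies the same logic to a string in memory. The function name collapse_whitespace and the sample text are made up for illustration and are not part of the original snippet.

```python
def collapse_whitespace(text):
    """Drop blank lines and collapse whitespace runs, as in Method 2."""
    cleaned = []
    for line in text.splitlines():
        # Whitespace-only lines strip down to an empty (falsy) string.
        if line.strip():
            # split() breaks on any run of spaces or tabs.
            cleaned.append(" ".join(line.split()))
    return "\n".join(cleaned)


# Hypothetical sample: extra spaces, a tab, and two blank lines.
sample = "first   line\n\n   \n\tsecond \t line  \n"
result = collapse_whitespace(sample)
print(result)
```

With this sample, the two blank lines disappear and each remaining line ends up with single spaces between words.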


Method 3:

import re

with open("filename", "r") as f:
    content = f.read()

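# Collapse each run of blank or whitespace-only lines into a single newline.
content = re.sub(r'\n\s*\n', '\n', content)
# Collapse runs of spaces and tabs into a single space.
content = re.sub(r'[ \t]+', ' ', content)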

with open("filename", "w") as f:
    f.write(content)
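The regex approach can also be wrapped in a function and tried on a string before pointing it at a real file. The sketch below is a variant, not the exact code above: it assumes you also want tabs collapsed and blank lines at the very start of the text removed, so it uses a multiline pattern anchored at line starts. The name clean_text and the sample string are hypothetical.

```python
import re


def clean_text(content):
    """Regex cleanup in the spirit of Method 3 (illustrative variant)."""
    # Collapse runs of spaces and tabs into one space.
    content = re.sub(r"[ \t]+", " ", content)
    # Remove lines that are empty or hold only whitespace.
    # With re.MULTILINE, ^ matches at the start of every line.
    content = re.sub(r"^\s*\n", "", content, flags=re.MULTILINE)
    return content


# Hypothetical sample with repeated spaces, a tab, and blank lines.
sample = "alpha   beta\n\n \t \ngamma\tdelta\n"
cleaned = clean_text(sample)
print(cleaned)
```

Unlike Method 2, this keeps the final newline and leaves a single leading space on lines that were indented.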

