Prerequisites
- Perl: Ensure Perl is installed on your system.
- CPAN Modules: Install the necessary CPAN modules:
  - Text::CSV_XS for CSV handling.
  - DBD::SQLite for SQLite database interaction.

You can install these modules using CPAN:

```shell
cpan Text::CSV_XS
cpan DBD::SQLite
```
ETL Pipeline Script
```perl
#!/usr/bin/perl
use strict;
use warnings;
use Text::CSV_XS;
use DBI;

my $input_csv  = 'input_data.csv';
my $output_db  = 'output_data.db';
my $table_name = 'transformed_data';

# Extract: read the data rows from the CSV file.
sub extract_data {
    my $csv = Text::CSV_XS->new({ binary => 1, auto_diag => 1 });
    open my $fh, "<:encoding(utf8)", $input_csv or die "Cannot open $input_csv: $!";
    my $header = $csv->getline($fh);    # skip the header row
    my @data;
    while (my $row = $csv->getline($fh)) {
        push @data, $row;
    }
    close $fh;
    return \@data;
}

# Transform: uppercase each name.
sub transform_data {
    my ($data) = @_;
    my @transformed_data;
    foreach my $row (@$data) {
        my ($id, $name, $age) = @$row;
        $name = uc($name);
        push @transformed_data, [$id, $name, $age];
    }
    return \@transformed_data;
}

# Load: write the transformed rows into an SQLite table.
sub load_data {
    my ($data) = @_;
    my $dbh = DBI->connect("dbi:SQLite:dbname=$output_db", "", "",
                           { RaiseError => 1, AutoCommit => 1 })
        or die "Could not connect to database: $DBI::errstr";
    $dbh->do("CREATE TABLE IF NOT EXISTS $table_name (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)");
    my $sth = $dbh->prepare("INSERT INTO $table_name (id, name, age) VALUES (?, ?, ?)");
    foreach my $row (@$data) {
        $sth->execute(@$row);
    }
    $dbh->disconnect();
}

sub main {
    my $extracted_data   = extract_data();
    my $transformed_data = transform_data($extracted_data);
    load_data($transformed_data);
    print "ETL process completed successfully!\n";
}

main();
```
Explanation
- Extract: The extract_data function reads data from the CSV file using the Text::CSV_XS module, storing each row as an array reference inside a single array.
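As a variation on the extract step, Text::CSV_XS can also return each row as a hash reference keyed by the header line, which keeps the header out of the data and makes downstream code more readable. A minimal, self-contained sketch (the file name demo_input.csv and the function name extract_data_hr are just for this illustration):

```perl
use strict;
use warnings;
use Text::CSV_XS;

# Variant of extract_data that keys each row by the CSV header line.
sub extract_data_hr {
    my ($file) = @_;
    my $csv = Text::CSV_XS->new({ binary => 1, auto_diag => 1 });
    open my $fh, "<:encoding(utf8)", $file or die "Cannot open $file: $!";
    $csv->column_names($csv->getline($fh));    # consume the header row
    my @rows;
    while (my $row = $csv->getline_hr($fh)) {  # { id => ..., name => ..., age => ... }
        push @rows, $row;
    }
    close $fh;
    return \@rows;
}

# Quick demonstration with a throwaway CSV file.
open my $out, '>', 'demo_input.csv' or die $!;
print $out "id,name,age\n1,John Doe,30\n";
close $out;
my $rows = extract_data_hr('demo_input.csv');
print "first name: $rows->[0]{name}\n";   # prints "first name: John Doe"
unlink 'demo_input.csv';
```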
- Transform: The transform_data function takes the extracted rows and applies the transformations; in this example, it converts every name to uppercase.
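The transform step is the natural place for cleanup rules beyond uppercasing. As a sketch of one possible extension (the function transform_data_strict is hypothetical, not part of the script above), a stricter variant might trim whitespace and drop rows whose age is not numeric:

```perl
use strict;
use warnings;

# Hypothetical stricter variant of transform_data: trims whitespace,
# uppercases the name, and skips rows with a non-numeric age.
sub transform_data_strict {
    my ($data) = @_;
    my @out;
    for my $row (@$data) {
        my ($id, $name, $age) = @$row;
        next unless defined $age && $age =~ /^\d+$/;   # skip malformed rows
        $name =~ s/^\s+|\s+$//g;                       # trim surrounding whitespace
        push @out, [ $id, uc($name), $age ];
    }
    return \@out;
}

my $rows = transform_data_strict([
    [1, '  John Doe ', 30],
    [2, 'Jane Smith', 'n/a'],   # malformed age, dropped
]);
print scalar(@$rows), " row(s) kept; first name: $rows->[0][1]\n";
# prints "1 row(s) kept; first name: JOHN DOE"
```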
- Load: The load_data function connects to an SQLite database through the DBD::SQLite driver, creates the table if it doesn't exist, and inserts the transformed rows.
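For larger files, the load step benefits from wrapping all inserts in a single transaction rather than letting SQLite commit each INSERT individually. A sketch of that variant using DBI's transaction calls (the function and parameter names are illustrative, not part of the original script):

```perl
use strict;
use warnings;
use DBI;

# Variant of load_data that batches all inserts into one transaction
# and rolls back the whole batch if any insert fails.
sub load_data_tx {
    my ($data, $db_file, $table) = @_;
    my $dbh = DBI->connect("dbi:SQLite:dbname=$db_file", "", "",
                           { RaiseError => 1, AutoCommit => 1 });
    $dbh->do("CREATE TABLE IF NOT EXISTS $table (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)");
    my $sth = $dbh->prepare("INSERT INTO $table (id, name, age) VALUES (?, ?, ?)");
    $dbh->begin_work;                # suspend AutoCommit for the batch
    eval {
        $sth->execute(@$_) for @$data;
        $dbh->commit;
    };
    if ($@) {
        $dbh->rollback;              # undo the partial batch on any error
        die "Load failed: $@";
    }
    $dbh->disconnect;
}
```

One transaction per batch avoids a disk sync per row, which is usually the dominant cost when loading many rows into SQLite.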
- Main: The main function orchestrates the ETL process by calling the extract, transform, and load functions in sequence.
Running the Script
- Save the script to a file, e.g., etl_pipeline.pl.
- Make it executable: chmod +x etl_pipeline.pl
- Run it: ./etl_pipeline.pl (or run it through the interpreter with perl etl_pipeline.pl)
Sample Input CSV (input_data.csv)

```
id,name,age
1,John Doe,30
2,Jane Smith,25
3,Alice Johnson,35
```
Expected Output in SQLite Database
After running the script, the output_data.db SQLite database will contain a table transformed_data with the following data:

| id | name          | age |
|----|---------------|-----|
| 1  | JOHN DOE      | 30  |
| 2  | JANE SMITH    | 25  |
| 3  | ALICE JOHNSON | 35  |
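To sanity-check the result without opening the sqlite3 shell, the expected table contents can be reproduced and queried from Perl. This self-contained snippet rebuilds the same schema in an in-memory database (so it doesn't depend on output_data.db existing) and prints the rows back:

```perl
use strict;
use warnings;
use DBI;

# Recreate the expected transformed_data table in memory and query it.
my $dbh = DBI->connect("dbi:SQLite:dbname=:memory:", "", "", { RaiseError => 1 });
$dbh->do("CREATE TABLE transformed_data (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)");
my $sth = $dbh->prepare("INSERT INTO transformed_data (id, name, age) VALUES (?, ?, ?)");
$sth->execute(@$_) for ([1, 'JOHN DOE', 30], [2, 'JANE SMITH', 25], [3, 'ALICE JOHNSON', 35]);

my $rows = $dbh->selectall_arrayref("SELECT id, name, age FROM transformed_data ORDER BY id");
printf "%-3s %-15s %s\n", @$_ for @$rows;
$dbh->disconnect;
```

Pointing the same SELECT at output_data.db instead of :memory: verifies the real pipeline output.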
This Perl script provides a basic yet complete ETL pipeline that can be easily extended or modified to suit more complex scenarios. It demonstrates how to extract data from a CSV file, transform it, and load it into a SQLite database. This example should serve as a useful reference for developers looking to implement ETL processes in Perl.