Thursday, 30 January 2025

Building an ETL Pipeline with Perl and Amazon Redshift

Building an ETL pipeline around a data warehouse such as Amazon Redshift, Google BigQuery, or Snowflake is a common task in modern data engineering. In this blog post, we’ll walk through building an ETL pipeline in Perl that extracts data from a data warehouse, transforms it, and loads it into another data warehouse or database. For this example, we’ll use Amazon Redshift as the data warehouse.

Overview

This ETL pipeline will:

  1. Extract: Fetch data from an Amazon Redshift data warehouse.
  2. Transform: Perform transformations on the data (e.g., cleaning, aggregations, or calculations).
  3. Load: Insert the transformed data into another Amazon Redshift table or a different data warehouse.
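
The three stages above can be sketched in plain Perl. This is only an illustration of the shape of such a pipeline, not the post's implementation: the hard-coded rows stand in for a Redshift query result, the in-memory array stands in for the target table, and the sub names extract_rows, transform_rows, and load_rows are hypothetical.

```perl
use strict;
use warnings;

# Extract: a real pipeline would query Redshift here.
# Hard-coded rows stand in for the result set (assumption).
sub extract_rows {
    return (
        { region => ' east ', amount => '10.50' },
        { region => 'WEST',   amount => '4.25'  },
        { region => 'East',   amount => '2.00'  },
    );
}

# Transform: normalize the region name, then sum the amounts per region.
sub transform_rows {
    my @rows = @_;
    my %total;
    for my $row (@rows) {
        (my $region = lc $row->{region}) =~ s/^\s+|\s+$//g;
        $total{$region} += $row->{amount};
    }
    return map { +{ region => $_, total => $total{$_} } } sort keys %total;
}

# Load: a real pipeline would insert into the target table.
# An in-memory array stands in for that table (assumption).
my @target_table;
sub load_rows {
    my ($table, @rows) = @_;
    push @$table, @rows;
    return scalar @rows;
}

my $loaded = load_rows(\@target_table, transform_rows(extract_rows()));
print "Loaded $loaded rows\n";
printf "%s: %.2f\n", $_->{region}, $_->{total} for @target_table;
```

Keeping each stage in its own sub means the stand-ins can later be swapped for real database calls without touching the other two stages.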

Tuesday, 28 January 2025

Using Amazon Athena with Perl

Amazon Athena is a serverless query service that lets you analyze data directly in Amazon S3 using standard SQL. While Athena is typically used from Python, Java, or other popular languages, Perl developers can also use it through Paws, the community-maintained Perl SDK for AWS, or through direct HTTP requests.

In this guide, we’ll explore how to use Amazon Athena with Perl, covering everything from basic setup to advanced use cases. Whether you’re a seasoned Perl developer or just getting started with AWS, this post will serve as a detailed reference for integrating Athena into your Perl applications.

Monday, 27 January 2025

Perl Built-ins Quick Reference

1. String Functions

length

Returns the length of a string.

my $str = "Hello, World!";
my $len = length($str);
print "Length: $len\n";  # Output: Length: 13

substr

Extracts a substring from a string.

my $str = "Hello, World!";
my $sub = substr($str, 0, 5);
print "Substring: $sub\n";  # Output: Substring: Hello

index

Returns the zero-based position of the first occurrence of a substring within a string, or -1 if the substring is not found.

my $str = "Hello, World!";
my $pos = index($str, "World");
print "Position: $pos\n";  # Output: Position: 7

rindex

Returns the zero-based position of the last occurrence of a substring within a string, or -1 if the substring is not found.

my $str = "Hello, World! World!";
my $pos = rindex($str, "World");
print "Last Position: $pos\n";  # Output: Last Position: 14

uc

Converts a string to uppercase.

my $str = "Hello, World!";
my $uc_str = uc($str);
print "Uppercase: $uc_str\n";  # Output: Uppercase: HELLO, WORLD!

lc

Converts a string to lowercase.

my $str = "Hello, World!";
my $lc_str = lc($str);
print "Lowercase: $lc_str\n";  # Output: Lowercase: hello, world!

ucfirst

Converts the first character of a string to uppercase.

my $str = "hello, world!";
my $ucfirst_str = ucfirst($str);
print "Ucfirst: $ucfirst_str\n";  # Output: Ucfirst: Hello, world!

Saturday, 25 January 2025

Understanding the “AttributeError: ‘dict’ object has no attribute ‘iteritems’” in Python 3

If you are working with Python 3 and encounter an error like this:

AttributeError: 'dict' object has no attribute 'iteritems'

You’re not alone! This error appears when code originally written for Python 2 is run under Python 3. Python 3 removed the dictionary methods iteritems(), iterkeys(), and itervalues(); their replacements are items(), keys(), and values(), which return view objects.

Wednesday, 22 January 2025

Graphical Diff Tools for Linux: Exploring the Best Options for Code Comparison

When working in a Linux environment, comparing files and directories efficiently is essential for developers and system administrators. Many users familiar with Araxis Merge or Beyond Compare on Windows may wonder if there are Linux alternatives with similar functionality. This guide introduces some of the best graphical diff tools available for Linux, highlighting unique features, performance considerations, and ideal use cases for each tool.

Sunday, 19 January 2025

Performing Perl Substitution Without Modifying the Original String

Introduction

When working with strings in Perl, a frequent task is to perform substitutions using regular expressions. However, there are times when you want to keep the original string intact while storing the modified version in a new variable. In this blog post, we’ll explore various ways to achieve this in Perl, ensuring that your original string remains unchanged while you work with its modified copy.

The Traditional Method: Copy and Modify

The most straightforward approach to perform a substitution while preserving the original string is to first copy the string to a new variable and then apply the substitution to the copy.
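
A minimal sketch of this copy-then-modify approach; the variable names and the sample text are illustrative only.

```perl
use strict;
use warnings;

my $original = "Hello, World!";

# Copy first, then substitute on the copy only.
my $modified = $original;
$modified =~ s/World/Perl/;

# The same thing in a single statement: the parentheses make the
# assignment happen first, so the substitution binds to the new variable.
(my $short = $original) =~ s/World/Perl/;

print "Original: $original\n";   # Output: Original: Hello, World!
print "Modified: $modified\n";   # Output: Modified: Hello, Perl!
print "One-line: $short\n";      # Output: One-line: Hello, Perl!
```

Without the parentheses in the one-statement form, the substitution would be applied to $original itself, and the new variable would receive the substitution's return value rather than the modified string.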

Tuesday, 14 January 2025

Embracing Microservices with Django for Modern Web Applications


Transitioning from a monolithic architecture to a microservices architecture involves several considerations, especially when Django must work alongside front-end frameworks like React or Angular and, potentially, other back-end technologies such as Go, FastAPI, or Java Spring. This post explores a practical approach to building a microservices-based system with Django and how to structure such an architecture effectively.

Thursday, 9 January 2025

How to Check if an Array Includes a Value in JavaScript: The Modern Way


When working with JavaScript arrays, one of the most common tasks is to check whether a certain value exists within an array. Historically, this required writing loops or using methods that were not always intuitive. However, modern JavaScript provides more concise and readable solutions. Let’s explore the best methods for checking if an array contains a value and why you should use them.

1. The Modern Way: Array.prototype.includes()

ECMAScript 2016 (ES7) introduced the includes() method, which is the simplest and most readable way to check if an array contains a value. It is supported by all modern browsers, but not by Internet Explorer.

Monday, 6 January 2025

How to Quickly Create a Large File on Linux

Creating large files efficiently on Linux can be crucial, especially for testing purposes like simulating disk usage or creating virtual machine (VM) images. While tools like dd are commonly used, they can be slow. Here’s a breakdown of faster methods to create large files, avoiding slow disk writes while ensuring the file is allocated on the disk.

1. Using fallocate (Best Choice for Most Cases)

The fallocate command is typically the fastest way to create large files on filesystems that support it (such as ext4, XFS, and Btrfs), as it allocates the required disk space without initializing or writing data. The entire space is reserved up front, with no time spent filling the file.

Saturday, 4 January 2025

How to Get File Creation and Modification Dates/Times in Shell/Bash

When working with files in a shell or bash environment, it’s often useful to retrieve metadata such as file creation and modification dates/times. Below are several methods to achieve this on different platforms, such as Linux and Windows.

1. Modification Date/Time

Retrieving the modification date and time of a file is straightforward and works across both Linux and Windows platforms.

Wednesday, 1 January 2025

When to Use Classes in Python: A Guide for Data and Automation Developers

If you’ve been working with Python primarily for data processing, automation scripts, or small web applications, you may wonder when, if ever, you should use classes. Python supports multiple programming paradigms, and while classes are central to Object-Oriented Programming (OOP), not every Python script needs them. Here’s a guide on when classes can be useful in Python, especially for tasks involving automation, data handling, and small web projects.

Why Use Classes?

Classes provide a way to organize code that models complex data and behavior. They can make your scripts more modular, maintainable, and reusable. Here are some scenarios where classes might improve your code:
