11 posts tagged with "PostgreSQL"

Securing Your Express REST API with Passport.js

· 6 min read
Huseyin BABAL
Software Developer

As web applications grow, secure authentication becomes essential to protect sensitive data and prevent unauthorized access. In this article, we’ll explore how to secure a Node.js API using Passport.js and JSON Web Tokens (JWT) for stateless, token-based authentication. We’ll also use PostgreSQL for persistent storage of user data, providing a robust, scalable setup ideal for modern applications.

Why Use Passport.js and JWT for Node.js?

Passport.js is a powerful, flexible middleware for handling authentication in Node.js applications. When paired with JWTs, it enables scalable, stateless authentication without the need to manage session data. JWTs are particularly useful in mobile-friendly applications where maintaining server-side sessions is impractical.

Advantages

Using Passport.js, PostgreSQL and JWTs offers several key benefits:

  • Passport.js simplifies integration of various authentication strategies.
  • JWTs allow for stateless authentication, meaning no session management overhead.
  • PostgreSQL offers a reliable, ACID-compliant database for securely storing user credentials.

In this article, we will be using PostgreSQL as our database. You can maintain your database in any database management system. For a convenient deployment option, consider cloud-based solutions like Rapidapp, which offers managed PostgreSQL databases, simplifying setup and maintenance.

tip

Create a free database in Rapidapp in seconds here

Step-by-Step Guide to Securing Your Express API

Project Initialization and Dependencies

Before diving into the code, ensure you have:

  • Node.js and npm installed.
  • PostgreSQL database ready for storing user data.
  • Basic understanding of JavaScript and Node.js.

To start, initialize a new Node.js project and install required dependencies:

mkdir express-rest-api-jwt && cd express-rest-api-jwt
npm init -y
npm install express passport passport-jwt jsonwebtoken pg bcrypt dotenv

We’re using:

  • express: Web framework for Node.js.
  • passport and passport-jwt: Middleware and JWT strategy for authentication.
  • jsonwebtoken: For generating and verifying JWTs.
  • pg: PostgreSQL client for Node.js.
  • bcrypt: For securely hashing passwords.
  • dotenv: For environment variable management.

Configuring PostgreSQL for User Data

Set up a PostgreSQL database to store user information. Connect to your PostgreSQL instance and create a new database and table. If you are using Rapidapp, the database is already created there.

CREATE TABLE users (
    id SERIAL PRIMARY KEY,
    email VARCHAR(255) UNIQUE NOT NULL,
    password_hash VARCHAR(255) NOT NULL
);

In this setup, the users table stores a unique email and a hashed password. Next, create a .env file to manage sensitive configuration:

DATABASE_URL=postgresql://username:password@host:5432/secure_app
JWT_SECRET=your_jwt_secret_key

Remember to replace username, password, and other values with your actual credentials.

Building the User Authentication Logic

Password Hashing

To securely store user passwords, use bcrypt to hash them before saving to the database. This prevents storing plaintext passwords.

server.js
const bcrypt = require('bcrypt');
const saltRounds = 10;

// Hashing function
async function hashPassword(password) {
  return await bcrypt.hash(password, saltRounds);
}

User Registration Endpoint

Create a registration endpoint to handle new user signups. Hash the user’s password and save it in the database.

server.js
require('dotenv').config(); // load DATABASE_URL and JWT_SECRET from .env
const express = require('express');
const app = express();
const { Pool } = require('pg');
const pool = new Pool({ connectionString: process.env.DATABASE_URL });

app.use(express.json());

app.post('/register', async (req, res) => {
  const { email, password } = req.body;
  const passwordHash = await hashPassword(password);

  try {
    await pool.query('INSERT INTO users (email, password_hash) VALUES ($1, $2)', [email, passwordHash]);
    res.status(201).json({ message: 'User registered successfully' });
  } catch (error) {
    res.status(500).json({ error: 'User registration failed' });
  }
});

User Login and JWT Generation

Create a login endpoint to validate credentials. If valid, generate a JWT for the user.

server.js
const jwt = require('jsonwebtoken');

app.post('/login', async (req, res) => {
  const { email, password } = req.body;

  const result = await pool.query('SELECT * FROM users WHERE email = $1', [email]);
  const user = result.rows[0];

  if (user && await bcrypt.compare(password, user.password_hash)) {
    const token = jwt.sign({ id: user.id, email: user.email }, process.env.JWT_SECRET, { expiresIn: '1h' });
    res.json({ token });
  } else {
    res.status(401).json({ error: 'Invalid credentials' });
  }
});

Implementing Passport.js with JWT Strategy

Configure Passport.js to use the JWT strategy. Define the strategy with passport-jwt; here we keep it in server.js alongside the rest of the setup.

server.js
const passport = require('passport');
const { Strategy, ExtractJwt } = require('passport-jwt');

passport.use(new Strategy({
  jwtFromRequest: ExtractJwt.fromAuthHeaderAsBearerToken(),
  secretOrKey: process.env.JWT_SECRET
}, async (jwtPayload, done) => {
  const result = await pool.query('SELECT * FROM users WHERE id = $1', [jwtPayload.id]);
  const user = result.rows[0];
  return user ? done(null, user) : done(null, false);
}));

app.use(passport.initialize());

With this configuration, Passport extracts the JWT from the Authorization header and verifies it using our secret key.

Creating Protected Routes

Now that Passport is set up, you can protect specific routes by requiring authentication. Passport will verify the JWT before allowing access to these routes.

server.js
app.get('/profile', passport.authenticate('jwt', { session: false }), (req, res) => {
  res.json({ message: `Welcome ${req.user.email}` });
});

In this example, the /profile route requires a valid JWT. If authentication succeeds, the request proceeds; otherwise, it’s rejected.

Serving the API

Finally, start the Express server to serve the API:

server.js
const PORT = process.env.PORT || 3000;

app.listen(PORT, () => {
  console.log(`Server is running on http://localhost:${PORT}`);
});

Testing the Authentication Workflow

Register a New User

To create a new user, make a POST request to the /register endpoint with the user’s email and password.

curl -X POST http://localhost:3000/register \
-H "Content-Type: application/json" \
-d '{"email": "[email protected]", "password": "securepassword"}'

If successful, this will return:

{
  "message": "User registered successfully"
}

Login with the Registered User

To log in, make a POST request to the /login endpoint with the same email and password. This will return a JWT if the credentials are correct.

curl -X POST http://localhost:3000/login \
-H "Content-Type: application/json" \
-d '{"email": "[email protected]", "password": "securepassword"}'

If successful, you’ll receive a response similar to this:

{
  "token": "your_jwt_token_here"
}

Access the Protected Profile Endpoint

To access the protected /profile endpoint, you’ll need to include the JWT in the Authorization header as a Bearer token.

Replace your_jwt_token_here with the actual token you received from the login step:

curl -X GET http://localhost:3000/profile \
-H "Authorization: Bearer your_jwt_token_here"

If the JWT is valid, you should receive a response like:

{
  "message": "Welcome [email protected]"
}

If the token is missing or invalid, you’ll likely get a 401 Unauthorized response:

{
  "error": "Unauthorized"
}

Conclusion

Using Passport.js and JWT for authentication in a Node.js application provides a secure, stateless setup ideal for scaling. Combined with PostgreSQL, this setup efficiently handles user management while maintaining security best practices. With these foundations, you’re well-equipped to build secure, scalable applications.

tip

You can find the complete source code for this project on GitHub.

Database Change Management with Spring Boot Liquibase and PostgreSQL

· 8 min read
Huseyin BABAL
Software Developer

Introduction: Why Database Migrations Matter?

Database migrations are critical for maintaining a consistent and reliable schema across environments in modern application development. If you’re working with Spring Boot and PostgreSQL, Liquibase offers a powerful, flexible solution for managing database changes. In this guide, we’ll explore setting up Liquibase with Spring Boot and PostgreSQL to simplify schema updates, version control, and rollbacks.

Why Use Liquibase?

Liquibase is a widely used open-source tool that provides a structured way to track, manage, and apply database schema changes. It offers a variety of features to support developers in complex environments:

  • Changelog Files: Liquibase organizes migrations into changelog files that support XML, YAML, JSON, or SQL formats.
  • Rollback Support: Liquibase allows you to define rollback logic, making it easy to revert changes if needed.
  • Database Independence: With support for over 30 databases, Liquibase offers flexibility across projects.

By integrating Liquibase with Spring Boot, you can automate migrations, manage schema versioning, and improve collaboration across teams.

In this article, we will be using PostgreSQL as our database. You can maintain your database in any database management system. For a convenient deployment option, consider cloud-based solutions like Rapidapp, which offers managed PostgreSQL databases, simplifying setup and maintenance.

tip

Create a free database in Rapidapp in seconds here

Step-by-Step Guide to Use Liquibase Migrations

Project Initialization and Dependencies

We will be using Spring Boot and PostgreSQL to build a todo application. You can initialize a Spring Boot project by using the Spring Boot CLI. Once installed, you can use the following command to initialize a project with the required dependencies.

spring init \
--dependencies=liquibase,data-jpa,postgresql \
--type=maven-project \
--javaVersion=21 \
liquibase-migrations-demo

Line 2: liquibase for Liquibase integration, data-jpa for database persistence, and postgresql for PostgreSQL driver.

Line 3: --type=maven-project for creating a Maven project.

Line 4: --javaVersion=21 sets the project's Java version to 21.

Now that we initialized the project, go to the folder liquibase-migrations-demo and open it with your favourite IDE.

Application Configuration

Next, configure Liquibase in your application.properties file. At a minimum, you need to specify the database connection details:

application.properties
spring.datasource.url=jdbc:postgresql://<host>:<port>/<db>
spring.datasource.username=<user>
spring.datasource.password=<password>

spring.liquibase.enabled=true
spring.liquibase.change-log=classpath:db/changelog/db.changelog-master.yaml

The spring.liquibase.change-log property specifies the location of the changelog that contains the actual schema changes.

Create Your First Changelog File

Liquibase migrations are structured into changelog files that record each schema change as an individual "changeSet."

  1. In src/main/resources/db/changelog/, create a new file named db.changelog-master.yaml. This file will be the main changelog file that references individual changes.

db.changelog-master.yaml
databaseChangeLog:
  - include:
      file: classpath:/db/changelog/changes/01-create-accounts-table.sql

  2. Now, create the referenced file 01-create-accounts-table.sql to define your first changeSet:

01-create-accounts-table.sql
CREATE TABLE accounts (
    user_id SERIAL PRIMARY KEY,
    username VARCHAR(50) UNIQUE NOT NULL,
    email VARCHAR(50) UNIQUE NOT NULL,
    password VARCHAR(50) NOT NULL,
    created_at TIMESTAMP NOT NULL,
    last_login TIMESTAMP
);

When it comes to naming the migration files, it's a good practice to follow a consistent naming convention. A common approach is to prefix the file with a version number and a brief description, like 01-create-accounts-table.sql.

  • Version number: There is no strict convention for the version number, but it helps with ordering the migration files.
  • Description: Provide a brief description of the migration.

SQL File Features in Liquibase

When using SQL, Liquibase offers some additional features within comments, such as:

  • Changesets: Define each migration step using --changeset author:id in comments. This keeps Liquibase’s tracking in place, similar to other formats.
  • Rollback Commands: Optionally add rollback commands with --rollback <SQL command>, allowing automatic rollbacks when needed. You can see a simple example below:
--liquibase formatted sql
--changeset huseyinbabal:1
CREATE TABLE accounts (
    user_id SERIAL PRIMARY KEY,
    username VARCHAR(50) UNIQUE NOT NULL,
    email VARCHAR(50) UNIQUE NOT NULL,
    password VARCHAR(50) NOT NULL,
    created_at TIMESTAMP NOT NULL,
    last_login TIMESTAMP
);

--rollback DROP TABLE accounts;

Running Application

When you run your Spring Boot application, Liquibase will automatically detect and apply any pending migrations to your PostgreSQL database. You’ll see output in the console indicating that the migrations have been successfully applied. You can run the application as follows.

./mvnw spring-boot:run

Rollback

If you need to roll back a migration, you can use the rollback command. For example, to roll back the last changeSet, you can run:

./mvnw liquibase:rollback \
-Dliquibase.rollbackCount=1 \
-Dliquibase.url="jdbc:postgresql://<host>:<port>/<db>" \
-Dliquibase.username="<username>" \
-Dliquibase.password="<password>" \
-Dliquibase.changeLogFile=src/main/resources/db/changelog/db.changelog-master.yaml \
-Dliquibase.driver=org.postgresql.Driver

This time we used the liquibase:rollback goal to roll back the last changeSet. You can also roll back to a specific point by providing the liquibase.rollbackTag parameter, as shown below. For more information, you can check the official documentation.
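
A tag-based rollback might look like the following sketch. It assumes you have previously tagged the schema (for example as v1.0) with a tagDatabase changeSet or the liquibase:tag goal; the connection parameters are the same as in the previous command.

./mvnw liquibase:rollback \
  -Dliquibase.rollbackTag=v1.0 \
  -Dliquibase.url="jdbc:postgresql://<host>:<port>/<db>" \
  -Dliquibase.username="<username>" \
  -Dliquibase.password="<password>" \
  -Dliquibase.changeLogFile=src/main/resources/db/changelog/db.changelog-master.yaml \
  -Dliquibase.driver=org.postgresql.Driver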

Best Practices for Migration Files

To ensure smooth database migrations, follow these best practices:

  • Organize Changelogs: Use a master changelog file to keep a well-organized list of all changes (see the example after this list).

  • Use Descriptive Names: Name changelog files descriptively to make it clear what each migration does.

  • Version Control Changelogs: Commit changelog files to version control to ensure that schema changes are applied consistently across environments.
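
For instance, a master changelog that grows with the project keeps one file per change and includes them in order. The second file name below is illustrative:

db.changelog-master.yaml
databaseChangeLog:
  - include:
      file: classpath:/db/changelog/changes/01-create-accounts-table.sql
  - include:
      file: classpath:/db/changelog/changes/02-add-last-login-index.sql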

Understanding the Liquibase Metadata Table

When you run migrations with Liquibase, it automatically creates two metadata tables in your database: DATABASECHANGELOG and DATABASECHANGELOGLOCK. These tables keep track of migration status and control, ensuring changes are applied in a consistent and non-conflicting manner. Let’s look at each table in detail.

DATABASECHANGELOG Table

The DATABASECHANGELOG table records every changeSet that has been successfully applied to the database, allowing Liquibase to know which changes have already been executed. Here’s an overview of its key columns:

  • ID: A unique identifier for each changeSet, as specified in the changelog file. This ID is combined with the AUTHOR and FILENAME columns to uniquely identify a changeSet.
  • AUTHOR: The author of the changeSet, which is specified in the changelog file.
  • FILENAME: The file path of the changelog file containing the changeSet. This helps differentiate between changeSets if multiple changelog files are used.
  • DATEEXECUTED: The timestamp indicating when the changeSet was applied. This helps you track when each change was made.
  • ORDEREXECUTED: A number indicating the order in which the changeSets were applied, starting from 1. This can help troubleshoot issues by showing the order of migrations.
  • EXECTYPE: Specifies how the changeSet was executed. Common values include EXECUTED (successful), RERAN, and FAILED, which indicate the status of each migration.
  • MD5SUM: A hash value representing the changeSet’s content. Liquibase uses this hash to detect if a changeSet has been modified after it was initially applied.
  • DESCRIPTION: A brief description of the change, based on the change type (e.g., createTable, addColumn).
  • COMMENTS: Any comments associated with the changeSet, as specified in the changelog.
  • TAG: Optional; a tag that can be assigned to a particular changeSet, often used to mark significant points in the schema history (e.g., a major release).
  • LIQUIBASE: The version of Liquibase used to apply the changeSet. This can be useful for tracking compatibility.
  • CONTEXT: An optional field that can be used to specify a context for the changeSet. This can help control which changeSets are applied based on the context.
  • LABELS: Optional; a comma-separated list of labels that can be assigned to a changeSet. Labels can be used to group changeSets based on common characteristics.
  • DEPLOYMENT_ID: An identifier that can be used to track deployments. This can be useful for auditing and tracking changes across environments.

The DATABASECHANGELOG table essentially serves as a ledger for Liquibase, ensuring it applies only new changeSets, avoiding duplications, and maintaining the history of changes.
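
You can inspect this ledger directly with an ordinary query; the columns below are the ones described above:

SELECT id, author, filename, dateexecuted, exectype
FROM databasechangelog
ORDER BY orderexecuted;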

DATABASECHANGELOGLOCK Table

The DATABASECHANGELOGLOCK table controls the concurrency of migrations to prevent multiple processes from applying changes at the same time, which can lead to conflicts or corruption. It has fewer columns but plays a critical role in database integrity.

  • ID: A unique identifier for the lock. This is generally 1 because only one lock record is required.
  • LOCKED: A boolean value indicating if a Liquibase process currently holds the lock. When set to TRUE, it prevents other instances from running migrations simultaneously.
  • LOCKGRANTED: The timestamp when the lock was acquired. This can be useful for diagnosing if a lock has been held too long (e.g., if a migration process was interrupted).
  • LOCKEDBY: Information on the host or process that acquired the lock, often containing the hostname and IP address. This can help identify which process holds the lock if issues arise.

The DATABASECHANGELOGLOCK table ensures that only one Liquibase instance modifies the schema at a time, which is particularly useful in distributed systems or CI/CD environments where migrations might be triggered concurrently.
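
If a migration process is interrupted while holding the lock, Liquibase may refuse to run until the lock is released. A common manual remedy, to be used only when you are certain no migration is actually in progress, is to clear the lock row:

UPDATE databasechangeloglock SET locked = FALSE, lockgranted = NULL, lockedby = NULL WHERE id = 1;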

How These Tables Support Safe and Consistent Migrations

  • The DATABASECHANGELOG table ensures migrations run in a consistent sequence and that changes are not re-applied or skipped.
  • The DATABASECHANGELOGLOCK table prevents race conditions, helping avoid potential conflicts from simultaneous migrations.

Understanding these tables can help you troubleshoot migration issues and ensure Liquibase runs smoothly. Both tables are essential for Liquibase’s reliability and ability to manage complex database schemas effectively.

Conclusion

By adopting Liquibase in your Spring Boot project, you can efficiently manage database changes, reduce deployment friction, and allow your team to focus more on innovation. Whether you’re deploying new features or managing complex database versions, Liquibase streamlines the process, making database management simpler, safer, and more collaborative.

Building Simple Todo REST API with PHP, Laravel, and PostgreSQL

· 4 min read
Huseyin BABAL
Software Developer

Introduction

In this tutorial, we’ll walk through building a basic Todo REST API using Laravel and PostgreSQL. Whether you’re looking to enhance your understanding of Laravel or you’re interested in integrating a PostgreSQL database into your project, this guide will provide you with the necessary steps to get up and running quickly.

Prerequisites

In this article, we will be using PostgreSQL as our database. You can maintain your database in any database management system. For a convenient deployment option, consider cloud-based solutions like Rapidapp, which offers managed PostgreSQL databases, simplifying setup and maintenance.

tip

Create a free database in Rapidapp in seconds here

Step-by-Step Implementation

Setting Up Your Laravel Project

First, let's create a new Laravel project. You can do this using Composer:

composer create-project --prefer-dist laravel/laravel laravel-todo-api

Next, navigate to the project directory:

cd laravel-todo-api

Configuring PostgreSQL Database

Open the .env file in the root directory of your Laravel project and update the database configuration to use PostgreSQL:

DB_CONNECTION=pgsql
DB_HOST=<host>
DB_PORT=<port>
DB_DATABASE=<database>
DB_USERNAME=<user>
DB_PASSWORD=<password>

Creating the Task Model and Migration

Run the following command to generate a model and migration for your tasks:

php artisan make:model Task -m

This command creates a Task model and a migration file located in the database/migrations directory.

Defining the Database Schema

Open the generated migration file in database/migrations and define the schema for the tasks table:

public function up()
{
    Schema::create('tasks', function (Blueprint $table) {
        $table->id();
        $table->string('title');
        $table->text('description')->nullable();
        $table->boolean('completed')->default(false);
        $table->timestamps();
    });
}

Run the migration to create the table:

php artisan migrate
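
If you need to undo this change while iterating on the schema, Laravel can roll back the most recent migration batch:

php artisan migrate:rollback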

Implementing the Task Model

To allow mass assignment, update the Task model (app/Models/Task.php) by adding the title, description, and completed attributes to the fillable array:

namespace App\Models;

use Illuminate\Database\Eloquent\Factories\HasFactory;
use Illuminate\Database\Eloquent\Model;

class Task extends Model
{
    use HasFactory;

    protected $fillable = [
        'title',
        'description',
        'completed',
    ];
}

Setting Up the API Routes

Define the routes for your API in the routes/web.php file. Laravel provides a convenient way to create RESTful routes using Route::apiResource. Prefix the group with api so the routes match the /api/tasks URLs used later in this guide (alternatively, routes placed in routes/api.php receive the /api prefix automatically):

use App\Http\Controllers\TaskController;

Route::prefix('api')->group(function () {
    Route::apiResource('tasks', TaskController::class);
});

Creating the TaskController

Generate a controller for handling API requests:

php artisan make:controller TaskController --api

In the generated TaskController (app/Http/Controllers/TaskController.php), implement the methods to handle CRUD operations:

namespace App\Http\Controllers;

use App\Models\Task;
use Illuminate\Http\Request;

class TaskController extends Controller
{
    public function index()
    {
        return Task::all();
    }

    public function store(Request $request)
    {
        $task = Task::create($request->all());
        return response()->json($task, 201);
    }

    public function show($id)
    {
        return Task::findOrFail($id);
    }

    public function update(Request $request, $id)
    {
        $task = Task::findOrFail($id);
        $task->update($request->all());
        return response()->json($task, 200);
    }

    public function destroy($id)
    {
        Task::findOrFail($id)->delete();
        return response()->json(null, 204);
    }
}

Running the Laravel Application

To start your Laravel application, use the built-in development server:

php artisan serve

Your API will be accessible at http://127.0.0.1:8000/api/tasks.

Disabling CSRF for API Routes

Since this is a REST API, you can disable CSRF protection for API routes. Laravel handles this by default in the api middleware group. To manually configure it, you can exclude the api/* routes from CSRF protection in bootstrap/app.php:

...
->withMiddleware(function (Middleware $middleware) {
    $middleware->validateCsrfTokens(
        except: ['api/*']
    );
})
...

Testing the API

List All Tasks

curl -X GET http://127.0.0.1:8000/api/tasks

Create a New Task

curl -X POST http://127.0.0.1:8000/api/tasks \
-H "Content-Type: application/json" \
-d '{
  "title": "Buy groceries",
  "description": "Milk, Bread, Cheese",
  "completed": false
}'

Get a Specific Task

curl -X GET http://127.0.0.1:8000/api/tasks/1

Update a Task

curl -X PUT http://127.0.0.1:8000/api/tasks/1 \
-H "Content-Type: application/json" \
-d '{
  "title": "Buy groceries and snacks",
  "description": "Milk, Bread, Cheese, Chips",
  "completed": false
}'

Delete a Task

curl -X DELETE http://127.0.0.1:8000/api/tasks/1

Conclusion

Congratulations! You've successfully built a simple Todo List REST API using Laravel and PostgreSQL. This foundation can be expanded with additional features, authentication, and more to create a fully functional web application.

Whether you're learning Laravel or looking to integrate a PostgreSQL database with your projects, this guide serves as a practical starting point.

tip

You can find the complete source code for this project on GitHub.

Streamlining Database Migrations with Spring Boot Flyway and PostgreSQL

· 5 min read
Huseyin BABAL
Software Developer

Introduction: Why Database Migrations Matter?

In the fast-paced world of software development, change is inevitable. As applications evolve, so too must the databases that support them. Whether you’re adding new features, optimizing performance, or fixing bugs, database schema changes are a critical part of the process. However, managing these changes manually can quickly become complex and error-prone, especially as your team grows or your application scales.

This is where database migrations come into play. Migrations allow you to apply consistent, repeatable changes to your database schema across all environments—development, testing, staging, and production. By automating this process, you reduce the risk of human error, ensure consistency, and maintain a clear history of how your database has evolved over time.

What is Flyway?

Flyway is an open-source database migration tool that makes it easy to manage and track schema changes. It works by applying incremental SQL or Java-based migrations in a controlled manner, ensuring that your database schema is always in sync with your application’s needs. Flyway integrates seamlessly with popular databases like PostgreSQL and frameworks like Spring Boot, making it a powerful tool for modern application development.

In this article, we will be using PostgreSQL as our database. You can maintain your database in any database management system. For a convenient deployment option, consider cloud-based solutions like Rapidapp, which offers managed PostgreSQL databases, simplifying setup and maintenance.

tip

Create a free database in Rapidapp in seconds here

Step-by-Step Guide to Use Flyway Migrations

Project Initialization and Dependencies

We will be using Spring Boot and PostgreSQL to build a todo application. You can initialize a Spring Boot project by using the Spring Boot CLI. Once installed, you can use the following command to initialize a project with the required dependencies.

spring init \
--dependencies=flyway,data-jpa,postgresql \
--type=maven-project \
--javaVersion=21 \
flyway-migrations-demo

Line 2: flyway for Flyway integration, data-jpa for database persistence, and postgresql for PostgreSQL driver.

Line 3: --type=maven-project for creating a Maven project.

Line 4: --javaVersion=21 sets the project's Java version to 21.

Now that we initialized the project, go to the folder flyway-migrations-demo and open it with your favourite IDE.

Application Configuration

Next, configure Flyway in your application.properties file. At a minimum, you need to specify the database connection details:

application.properties
spring.datasource.url=jdbc:postgresql://<host>:<port>/<db>
spring.datasource.username=<user>
spring.datasource.password=<password>
spring.flyway.locations=classpath:db/migration

The spring.flyway.locations property specifies the location where Flyway will look for migration files. By default, this is classpath:db/migration.

Create Migration Files

Migration files are where you define the changes to your database schema. Each migration file has a unique version number and a descriptive name. For example:

V1__Add_user_table.sql
V2__Alter_user_table_add_email.sql

  • Version number: Start with "V" followed by a version number (V1, V2, etc.). This helps Flyway determine the order in which migrations should be applied.
  • Separator: Use double underscores __ to separate the version number from the description.
  • Description: Provide a brief description of the migration.

Here’s an example of a simple migration file that creates a user table:

V1__Add_user_table.sql
CREATE TABLE users (
    id SERIAL PRIMARY KEY,
    username VARCHAR(50) NOT NULL,
    password VARCHAR(100) NOT NULL,
    email VARCHAR(100) NOT NULL
);

Note that user is a reserved keyword in PostgreSQL, so the table is named users here; creating a table literally named user would fail without quoting.

Running Application

When you run your Spring Boot application, Flyway will automatically detect and apply any pending migrations to your PostgreSQL database. You’ll see output in the console indicating that the migrations have been successfully applied. You can run the application as follows.

./mvnw spring-boot:run

Best Practices for Migration Files

To ensure smooth database migrations, follow these best practices:

  • Keep Migrations Small and Incremental: Break down complex changes into smaller, manageable steps. This makes it easier to troubleshoot issues and roll back changes if necessary.

  • Use Descriptive Names: The name of each migration should clearly describe its purpose. This makes it easier to understand the history of changes at a glance.

  • Test Migrations Thoroughly: Before applying migrations to production, test them in a staging environment that closely mirrors production. This helps catch any issues early.

  • Avoid Direct Modifications in Production: Always use migrations to make changes to the database schema. Direct modifications can lead to inconsistencies and make it difficult to track changes.

  • Version Control Your Migrations: Store your migration files in version control along with your application code. This ensures that schema changes are tracked and can be rolled back if needed.

Understanding the Flyway Metadata Table

Flyway maintains a metadata table in your database, typically named flyway_schema_history, to track which migrations have been applied. This table contains information such as:

  • Version: The version number of the migration.
  • Description: The description of the migration.
  • Script: The name of the migration file.
  • Execution Time: How long the migration took to apply.
  • Status: Whether the migration was successful.

This table is crucial for managing and auditing your database schema. It ensures that migrations are only applied once and provides a clear history of changes.
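
You can inspect it directly with a simple query; the column names below are the ones Flyway creates:

SELECT version, description, script, execution_time, success
FROM flyway_schema_history
ORDER BY installed_rank;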

Conclusion

Database migrations are a vital part of modern application development, enabling you to manage schema changes in a consistent, repeatable way. By integrating Flyway with Spring Boot and PostgreSQL, you can automate this process and reduce the risk of errors, ensuring that your database schema evolves alongside your application.

Building a Todo API with Rust - A Step-by-Step Guide Using Axum and Diesel

· 7 min read
Huseyin BABAL
Software Developer

Introduction

In the world of web development, performance and safety are paramount. Rust, with its emphasis on speed and memory safety, has emerged as a powerful language for building robust web applications. Today, we'll explore how to create a high-performance RESTful API for a Todo application using Rust, along with two of its most popular libraries: Axum for web services and Diesel for ORM.

  • Rust: A systems programming language that runs blazingly fast and prevents segfaults.
  • Axum: A web application framework that focuses on ergonomics and modularity.
  • Diesel: A safe, extensible ORM and Query Builder for Rust.

Prerequisites

  • Rust
  • Cargo for package management
  • Diesel
  • PostgreSQL

In this article, we will be using PostgreSQL as our database. You can maintain your database in any database management system. For a convenient deployment option, consider cloud-based solutions like Rapidapp, which offers managed PostgreSQL databases, simplifying setup and maintenance.

tip

Create a free database in Rapidapp in seconds here

Getting Started

You can initialize the project and add the required dependencies as follows:

# Initialize the project
cargo new todo-rs
cd todo-rs
# Add dependencies
cargo add \
axum \
tokio \
serde \
serde_json \
diesel \
dotenvy \
-F tokio/full,serde/derive,diesel/postgres,diesel/r2d2

cargo add installs the dependencies; when you also need optional features of a specific crate (package), use the -F parameter. For example, to include the postgres feature of diesel, the notation is diesel/postgres. The above command will populate the Cargo.toml file as follows:

[package]
name = "todo-rs"
version = "0.1.0"
edition = "2021"

[dependencies]
axum = "0.7.5"
axum-macros = "0.4.1"
diesel = { version = "2.2.2", features = ["postgres", "r2d2"] }
dotenvy = "0.15.7"
serde = { version = "1.0.204", features = ["derive"] }
serde_json = "1.0.122"
tokio = { version = "1.39.2", features = ["full"] }

DB Migration with Diesel

You can initialize the migration for your project for the first time with the following;

diesel setup

This will create a migrations folder and a diesel.toml file in the project root. In this article, we will implement a Todo REST API, and the only business model we have is todos. To generate migrations for the todos entity, we can use the following command.

diesel migration generate create_todos_table

This will generate a dedicated migration folder for the todos table creation. When you open the migration folder, you will see up.sql and down.sql files: up.sql is executed when we run the migration to apply DB changes, and down.sql is executed when we revert them. We are responsible for writing those SQL files, as you can see below.

up.sql
CREATE TABLE todos (
    id SERIAL PRIMARY KEY,
    title TEXT NOT NULL,
    content TEXT NOT NULL
);

down.sql
DROP TABLE todos;

Now we can run diesel migration run to apply the migrations. This will create the table and also generate a src/schema.rs file containing the mapped table definition for the todo entity, as follows.

src/schema.rs
// @generated automatically by Diesel CLI.

diesel::table! {
    todos (id) {
        id -> Int4,
        title -> Text,
        content -> Text,
    }
}

As the comment at the top indicates, this file is generated by Diesel and you shouldn't edit it manually.

Implement Axum Server

In this section we will implement the part responsible for running an HTTP server with axum. This server will expose the handlers for the CRUD operations of the Todo entity.

src/main.rs
#[tokio::main]
async fn main() {
    dotenv().ok();
    let database_url = env::var("DATABASE_URL").expect("DATABASE_URL must be set");
    let manager = ConnectionManager::<PgConnection>::new(database_url);
    let pool = r2d2::Pool::builder()
        .max_size(5)
        .build(manager)
        .expect("Failed to create pool.");
    let db_connection = Arc::new(pool);

    let app = Router::new()
        .route("/todos", post(handlers::create_todo))
        .route("/todos", get(handlers::get_todos))
        .route("/todos/:id", get(handlers::get_todo))
        .route("/todos/:id", post(handlers::update_todo))
        .route("/todos/:id", delete(handlers::delete_todo))
        .with_state(db_connection.clone());

    let listener = tokio::net::TcpListener::bind("0.0.0.0:8080").await.unwrap();
    let server = axum::serve(listener, app).with_graceful_shutdown(shutdown_signal());

    tokio::spawn(async move {
        println!("Server is running");
    });

    if let Err(e) = server.await {
        eprintln!("Server error: {}", e);
    }
}

Line 4-10: Set up the database connection pool

Line 12-18: Define the routes for our API

Line 20-21: Set up the server address

Line 23-30: Log application startup or failure

src/main.rs is the file we mostly do our global initializations like database connection pooling setup or preparing REST endpoints. Now that we have endpoints for the Todo entity, let's implement the real logic of those handlers.
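
The handlers below refer to Todo, NewTodo, and UpdateTodo structs and a DbPool alias that the snippets don't show. A minimal sketch of what they might look like, with field names matching the todos schema above:

src/models.rs
// A sketch; derive names follow Diesel 2.x and serde conventions
use diesel::prelude::*;
use serde::{Deserialize, Serialize};
use crate::schema::todos;

// Shared r2d2 pool, matching the State extractor used by the handlers
pub type DbPool = std::sync::Arc<diesel::r2d2::Pool<diesel::r2d2::ConnectionManager<diesel::PgConnection>>>;

#[derive(Queryable, Serialize)]
pub struct Todo {
    pub id: i32,
    pub title: String,
    pub content: String,
}

#[derive(Insertable, Deserialize)]
#[diesel(table_name = todos)]
pub struct NewTodo {
    pub title: String,
    pub content: String,
}

#[derive(AsChangeset, Deserialize)]
#[diesel(table_name = todos)]
pub struct UpdateTodo {
    pub title: String,
    pub content: String,
}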

Implementing Handlers

Create Todo Handler

In this handler, we accept a NewTodo request and create a new record in the database. In axum handlers, you can see a State extractor beside the request body; it is used for passing dependencies, like the database connection pool, into the handler for DB operations.

src/handlers.rs
pub async fn create_todo(
    State(db): State<DbPool>,
    Json(new_todo): Json<NewTodo>,
) -> (StatusCode, Json<Todo>) {
    let mut conn = db.get().map_err(|_| StatusCode::INTERNAL_SERVER_ERROR).unwrap();

    let todo = diesel::insert_into(todos::table)
        .values(&new_todo)
        .get_result(&mut conn)
        .map_err(|_| StatusCode::INTERNAL_SERVER_ERROR).unwrap();

    (StatusCode::CREATED, Json(todo))
}

Line 2: Accept db connection pool as dependency

Line 3: Request body as NewTodo

Line 5: Get available connection from DB connection pool, throw error otherwise.

Line 7: Insert new_todo in todos table

Line 12: Return CREATED status code and new todo item as response body

List Todos Handler

src/handlers.rs
pub async fn get_todos(
    State(db): State<DbPool>,
) -> (StatusCode, Json<Vec<Todo>>) {
    let mut conn = db.get().map_err(|_| StatusCode::INTERNAL_SERVER_ERROR).unwrap();

    let results = todos::table.load::<Todo>(&mut conn)
        .map_err(|_| StatusCode::INTERNAL_SERVER_ERROR).unwrap();

    (StatusCode::OK, Json(results))
}

This time, we don't expect a request body; we just load the todo items with the load function and cast them to the Todo struct. As always, we return the results in the response body with status code OK.

Get Todo Handler

We get the todo id from the path params and query the todos table, filtering by id, as follows.

src/handlers.rs
pub async fn get_todo(
    Path(todo_id): Path<i32>,
    State(db): State<DbPool>,
) -> (StatusCode, Json<Todo>) {
    let mut conn = db.get().map_err(|_| StatusCode::INTERNAL_SERVER_ERROR).unwrap();

    let result = todos::table.filter(id.eq(todo_id)).first::<Todo>(&mut conn)
        .map_err(|_| StatusCode::INTERNAL_SERVER_ERROR).unwrap();

    (StatusCode::OK, Json(result))
}

Update Todo Handler

In this handler, we accept the update payload from the end user and update the existing Todo by resolving the id from the path params.

src/handlers.rs
pub async fn update_todo(
    Path(todo_id): Path<i32>,
    State(db): State<DbPool>,
    Json(update_todo): Json<UpdateTodo>,
) -> (StatusCode, Json<Todo>) {
    let mut conn = db.get().map_err(|_| StatusCode::INTERNAL_SERVER_ERROR).unwrap();

    let todo = diesel::update(todos::table.filter(id.eq(todo_id)))
        .set(&update_todo)
        .get_result(&mut conn)
        .map_err(|_| StatusCode::INTERNAL_SERVER_ERROR).unwrap();

    (StatusCode::OK, Json(todo))
}

Delete Todo Handler

As you might guess, we resolve the todo id from the path params, then execute a delete query against the todos table as follows.

src/handlers.rs
pub async fn delete_todo(
    Path(todo_id): Path<i32>,
    State(db): State<DbPool>,
) -> StatusCode {
    let mut conn = db.get().map_err(|_| StatusCode::INTERNAL_SERVER_ERROR).unwrap();

    let _ = diesel::delete(todos::table.filter(id.eq(todo_id)))
        .execute(&mut conn)
        .map_err(|_| StatusCode::INTERNAL_SERVER_ERROR).unwrap();

    StatusCode::NO_CONTENT
}

Demo Time

Right after you set the environment variable DATABASE_URL, you can run the application as follows:

cargo run

Here are some Todo operations

Create a todo

curl -X POST -H "Content-Type: application/json" -d '{"title":"Buy groceries","content":"banana,milk"}' http://localhost:8080/todos

List all todos

curl http://localhost:8080/todos

Get a specific todo

curl http://localhost:8080/todos/1

Update a todo

curl -X POST -H "Content-Type: application/json" -d '{"title":"Buy Groceries", "content": "banana"}' http://localhost:8080/todos/1

Delete a todo

curl -X DELETE http://localhost:8080/todos/1

Conclusion

We've successfully built a Todo API using Rust, Axum, and Diesel. This combination provides a robust, safe, and efficient backend for web applications. The strong typing of Rust, combined with Diesel's compile-time checked queries and Axum's ergonomic routing, creates a powerful foundation for building scalable web services. By leveraging Rust's performance and safety features, we can create APIs that are not only fast but also resistant to common runtime errors. As you continue to explore Rust for web development, you'll find that this stack provides an excellent balance of developer productivity and application performance. Remember, this is just the beginning. You can extend this API with authentication, more complex queries, and additional features to suit your specific needs. Happy coding!

tip

You can find the complete source code for this project on GitHub.

Streaming PostgreSQL Changes to Kafka with Debezium

· 8 min read
Huseyin BABAL
Software Developer

Introduction: Why Send Changes to Kafka

In modern distributed systems, keeping multiple services in sync and maintaining data consistency across microservices can be challenging. When dealing with a microservices architecture, it's crucial to have an efficient way to propagate database changes to other services in real-time. One effective solution is to publish database changes to a message broker like Apache Kafka. Kafka acts as an intermediary that allows various services to subscribe to these changes and react accordingly. This approach ensures real-time data synchronization, reduces the complexity of direct service-to-service communication, and enhances the overall scalability and fault tolerance of the system.

Use Cases for Publishing Database Changes to Kafka

  • Real-Time Analytics: Feeding database changes to a real-time analytics system to provide up-to-the-minute insights.
  • Event-Driven Architecture: Enabling services to react to database changes, triggering workflows or business processes.
  • Cache Invalidation: Automatically invalidating or updating cache entries based on database changes to ensure consistency.
  • Data Replication: Replicating data across different data stores or geographic regions for redundancy and high availability.
  • Audit Logging: Keeping a comprehensive audit log of all changes made to the database for compliance and debugging purposes.

What is Debezium?

Debezium is an open-source distributed platform that captures database changes and streams them to Kafka in real-time. It leverages the database's transaction log to detect changes and publish them as events in Kafka topics. Debezium supports various databases, including PostgreSQL, MySQL, and MongoDB, making it a versatile choice for change data capture (CDC) needs.

PostgreSQL Configuration: Logical WAL Replication

In this article, we will be using PostgreSQL as our database with logical WAL replication enabled. You can maintain your database in any database management system. For a convenient deployment option, consider cloud-based solutions like Rapidapp, which offers managed PostgreSQL databases with built-in logical WAL replication, simplifying setup and maintenance.

tip

Create a free database with built-in logical WAL replication in Rapidapp in seconds here

If you choose to maintain your own PostgreSQL database, you can enable logical WAL replication with the following PostgreSQL configuration.

postgresql.conf
...
wal_level = logical
...

You can see more details about wal_level in the PostgreSQL documentation.
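
After changing wal_level you need to restart PostgreSQL; you can then verify the setting from any client session:

SHOW wal_level;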

Deploying Debezium Connect with PostgreSQL Connection

There are several ways to deploy Debezium Connect, but we will use Docker to spin up a container running Debezium Connect, as follows.

docker run --rm --name debezium \
-e BOOTSTRAP_SERVERS=<bootstrap_servers> \
-e GROUP_ID=1 \
-e CONFIG_STORAGE_TOPIC=connect_configs \
-e OFFSET_STORAGE_TOPIC=connect_offsets \
-e STATUS_STORAGE_TOPIC=connect_statuses \
-e ENABLE_DEBEZIUM_SCRIPTING='true' \
-e CONNECT_SASL_MECHANISM=SCRAM-SHA-256 \
-e CONNECT_SECURITY_PROTOCOL=SASL_SSL \
-e CONNECT_SASL_JAAS_CONFIG='org.apache.kafka.common.security.scram.ScramLoginModule required username="<username>" password="<password>";' \
-p 8083:8083 debezium/connect:2.7

BOOTSTRAP_SERVERS: The Kafka bootstrap servers. You can find this on the Upstash dashboard if you are using their managed Kafka.

CONNECT_SASL_JAAS_CONFIG: This part contains security module and username/password pair. You don't need to set this if you are not using Kafka with authentication. However, if you are using Kafka from Upstash, then you can find username and password values on Kafka cluster details page.

CONFIG_STORAGE_TOPIC: This environment variable is used to specify the Kafka topic where Debezium will store the connector properties.

OFFSET_STORAGE_TOPIC: This environment variable is used to specify the Kafka topic where Debezium will store the connector offsets.

STATUS_STORAGE_TOPIC: This environment variable is used to specify the Kafka topic where Debezium will store the connector statuses.

Debezium Connect is now ready, but it is empty: no source (PostgreSQL, in our case) is being tracked yet, and no data is being sent to the sink (Kafka).

We will also leverage two SaaS solutions:

  • Rapidapp for PostgreSQL: To quickly set up and manage our PostgreSQL database.
tip

Create a free database in Rapidapp Starter in seconds here

  • Upstash Kafka: A managed Kafka service that will receive the change events.
tip

Create a free Kafka cluster in Upstash here

Adding Debezium Connector

You can add new connector to Debezium Connect by using its REST API as follows.

curl --location 'http://localhost:8083/connectors' \
--header 'Accept: application/json' \
--header 'Content-Type: application/json' \
--data '{
  "name": "postgres-connector",
  "config": {
    "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
    "database.hostname": "<pg_host>",
    "database.port": "<pg_port>",
    "database.user": "<pg_user>",
    "database.password": "<pg_pass>",
    "database.dbname": "<pg_db>",
    "database.server.id": "<unique_id>",
    "table.include.list": "<schema.table_name>",
    "topic.prefix": "<pg_topic>",
    "plugin.name": "pgoutput",
    "kafka.bootstrap.servers": "<kafka_host>:<kafka_port>",
    "kafka.topic.prefix": "<kafka_topic_prefix>"
  }
}'

Line 7: This tells Debezium which connector class to use for the source.

Line 8-12: PostgreSQL connection properties; if you used Rapidapp, you can grab the details from the Connection Properties tab on the database details page.

Line 13: The unique database server id, which Debezium uses to differentiate the sources.

Line 14: The list of tables that will be monitored by Debezium.

Line 16: Tells Debezium which plugin to use for this connector to serialize/deserialize data from the PostgreSQL WAL.

Once the connector is created, you can verify it by listing the available connectors:

curl -XGET http://localhost:8083/connectors
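
You can also check the health of an individual connector through the standard Kafka Connect REST API:

curl -XGET http://localhost:8083/connectors/postgres-connector/status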

Step-by-Step Spring Boot Application Setup

In this section, we will implement a simple Spring Boot CRUD application where, whenever you make a modification in the PostgreSQL database, it will be synchronized to Kafka automatically. This is useful especially when some other service is interested in those changes. In our case, we will be maintaining Product information in a PostgreSQL database. Let's get started!

Project Initialization and Dependencies

We will be using Spring Boot and PostgreSQL to build the application. You can initialize a Spring Boot project by using the Spring Boot CLI. Once installed, you can use the following command to initialize a project with the required dependencies.

spring init \
--dependencies=web,data-jpa,postgresql,lombok \
--type=maven-project \
--javaVersion=21 \
spring-pg-debezium

Line 2: web for implementing REST endpoints, data-jpa for database persistence, and postgresql for PostgreSQL driver.

Line 3: --type=maven-project for creating a Maven project.

Line 4: --javaVersion=21 sets the project's Java version to 21.

Implementing Entity and Repository

We have only one entity here, Product, which will be used to store product information. Let's create a new entity called Product as follows.

@Entity
@Data
@NoArgsConstructor
@AllArgsConstructor
class Product {

    @Id
    @GeneratedValue
    private Long id;

    private String title;

    @Column(name = "price", precision = 10, scale = 2)
    private BigDecimal price;
}

Line 2: Automatically enable getter/setter methods by using Lombok

Line 3: Generate no-arg constructor

Line 4: Generate constructor with all instance variables

Line 13: Define the price column, which accepts values with 10 digits in total and 2 decimal places, e.g. 12345678.99

In order to manage the Product entity in the database, we will use the following repository interface (note that the id type parameter is Long, matching the entity's id field):

interface ProductRepository extends CrudRepository<Product, Long> {}

Implementing Rest Endpoints

We have one root endpoint /api/v1/products inside one controller and implement 3 actions for create, update, and delete as follows

@RestController
@RequestMapping("/api/v1/products")
@RequiredArgsConstructor
class ProductController {

    private final ProductRepository productRepository;

    @PostMapping
    void create(@RequestBody CreateProductRequest request) {
        Product product = new Product();
        product.setTitle(request.getTitle());
        product.setPrice(request.getPrice());
        productRepository.save(product);
    }

    @PatchMapping("/{id}")
    void update(@RequestBody UpdateProductRequest request, @PathVariable("id") Long id) {
        Product p = productRepository.findById(id).orElseThrow(() -> new EntityNotFoundException("Product not found"));
        p.setPrice(request.getPrice());
        productRepository.save(p);
    }

    @DeleteMapping("/{id}")
    void delete(@PathVariable("id") Long id) {
        productRepository.deleteById(id);
    }
}

create method accepts a request CreateProductRequest which contains title and price information as shown below.

@Data
@NoArgsConstructor
@AllArgsConstructor
class CreateProductRequest {

    private String title;

    private BigDecimal price;

}

update is used to update product price, and it accepts a request as follows.

@Data
@NoArgsConstructor
@AllArgsConstructor
class UpdateProductRequest {

    private BigDecimal price;

}

Now that the persistence layer and REST endpoints are ready, we can configure the application.

Application Configuration

This section contains application level configurations such as the application name, datasource, and jpa as shown below:

application.yaml
spring:
  application:
    name: spring-pg-debezium
  datasource:
    url: <connection-string-from-rapidapp|or your own managed postgres url>
    username: <username>
    password: <password>
  jpa:
    database-platform: org.hibernate.dialect.PostgreSQLDialect
    hibernate:
      ddl-auto: update

Line 5: Connection URL for the PostgreSQL database. You can obtain this from Rapidapp or your own managed PostgreSQL service. It should have a format like jdbc:postgresql://<host>:<port>/<database>?sslmode=require.

Running Application

You can run the application as follows.

./mvnw spring-boot:run

Demo

Once you perform any of the following requests, the change will be published to the Kafka cluster, where you can consume and inspect the message.
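
For example, you could watch the change events with the standard Kafka console consumer. The topic name is derived from the topic.prefix configured on the connector, followed by the schema and table names; the values below are placeholders matching that configuration (for an authenticated cluster such as Upstash you would also pass the SASL client properties via --consumer.config):

kafka-console-consumer.sh \
  --bootstrap-server <kafka_host>:<kafka_port> \
  --topic <pg_topic>.public.product \
  --from-beginning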

Create Product

curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/products -d '{"title": "Blue Iphone", "price": "37.3213"}'

Update Product

curl -XPATCH -H "Content-Type: application/json" http://localhost:8080/api/v1/products/1 -d '{"price": "37.1213"}'

Delete Product

curl -XDELETE  http://localhost:8080/api/v1/products/1

Conclusion

Integrating Debezium with PostgreSQL and Kafka in a Spring Boot environment allows you to efficiently stream database changes to various services. This setup not only enhances data consistency and real-time processing capabilities but also simplifies the architecture of your microservices. By following this guide, you can leverage the power of change data capture to build responsive and scalable applications.

tip

You can find the complete source code for this project on GitHub.

Building Location Based Search Service with Spring Boot PostgreSQL and PostGIS

· 12 min read
Huseyin BABAL
Software Developer

Introduction to Geospatial Data

Geospatial data, also known as spatial data, represents the physical location and shape of objects on the Earth's surface. It includes information such as latitude, longitude, altitude, and the spatial relationships between different objects. Geospatial data is used in a wide range of applications, from mapping and navigation to environmental monitoring and urban planning.

Use Cases for Geospatial Data

Geospatial data has numerous applications across various industries. Some common use cases include:

  • Navigation and Routing: GPS systems use geospatial data to provide real-time navigation and routing information.
  • Environmental Monitoring: Track changes in land use, deforestation, and urban sprawl using satellite imagery and geospatial analysis.
  • Urban Planning: Plan infrastructure projects, analyze traffic patterns, and manage public services using geospatial data.
  • Location-Based Services: Deliver personalized content, offers, and services based on a user's location.

Using Geospatial Data in PostgreSQL with PostGIS Extension

In this article, we will be using PostgreSQL as our database with PostGIS extension. You can maintain your database in any database management system. For a convenient deployment option, consider cloud-based solutions like Rapidapp, which offers managed PostgreSQL databases with built-in postgis extension, simplifying setup and maintenance.

tip

Create a free database with built-in postgis extension in Rapidapp in seconds here

If you choose to maintain your own PostgreSQL database, you can enable the PostGIS extension with the following command, run once per database:

CREATE EXTENSION postgis;
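
You can then confirm the extension is active by querying its version:

SELECT PostGIS_Version();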

Step-by-Step Guide to Creating the Location-Based Search Service

One practical application of geospatial data is a geolocation search application, where users can find nearby points of interest within a specified radius. In this article, we will build a Spring Boot application that searches for cities within a specified radius of a given point.

Project Initialization and Dependencies

We will be using Spring Boot and PostgreSQL to build the application. You can initialize a Spring Boot project by using the Spring Boot CLI. Once installed, you can use the following command to initialize a project with the required dependencies.

spring init \
--dependencies=web,data-jpa,postgresql,lombok \
--type=maven-project \
--javaVersion=21 \
spring-postgres-spatial

Line 2: web for implementing REST endpoints, data-jpa for database persistence, and postgresql for PostgreSQL driver.

Line 3: --type=maven-project for creating a Maven project.

Line 4: --javaVersion=21 sets the project's Java version to 21.

There is one more dependency we need to add to enable the spatial features of Hibernate: hibernate-spatial. Open pom.xml and add the following dependency to the dependencies section.

<dependency>
    <groupId>org.hibernate</groupId>
    <artifactId>hibernate-spatial</artifactId>
    <version>6.5.2.Final</version>
</dependency>

Now that we initialized the project, go to the folder spring-postgres-spatial and open it with your favourite IDE.

Implementing Entity and Repository

We have only one entity here, City, which will be used to store city information including its location. Let's create a new entity called City as follows.

@Entity
@Data
@NoArgsConstructor
@AllArgsConstructor
class City {

    @Id
    @GeneratedValue
    private Long id;

    private String name;

    @Column(columnDefinition = "geography(Point, 4326)")
    private Point location;
}

Line 2: Automatically enable getter/setter methods by using Lombok

Line 3: Generate no-arg constructor

Line 4: Generate constructor with all instance variables

Line 13: This is for using the special PostGIS data type geography, described as follows:

geography: This indicates that the column will use the PostGIS geography data type, which is designed for storing geospatial data in a way that accounts for the Earth's curvature. This type is particularly useful for global, large-scale datasets where you want accurate distance and area calculations.

Point: Specifies that the data type for this column is a geographic point. Points are used to store coordinates (latitude and longitude).

4326: This is the Spatial Reference System Identifier (SRID) for WGS 84, which is the standard coordinate system used by GPS. SRID 4326 ensures that the coordinates are stored in a globally recognized format.

In order to manage the City entity in the database, we will use the following repository interface (note that the id type parameter is Long, matching the entity's id field):

interface CityRepository extends CrudRepository<City, Long> {

    @Query("SELECT c FROM City c WHERE function('ST_DWithin', c.location, :point, :distance) = true")
    Iterable<City> findNearestCities(Point point, double distance);
}

ST_DWithin returns true if the geometries are within a given distance. In our case, it returns the cities whose location column in the City table is within a distance :distance of :point.
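
The JPQL above maps to a plain PostGIS query. For reference, the equivalent SQL run directly against the database (assuming Hibernate created the table as city) would look like this, searching within 100 km of Ankara's coordinates:

SELECT * FROM city
WHERE ST_DWithin(location, ST_MakePoint(32.8541, 39.9208)::geography, 100000);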

Implementing Rest Endpoints

We have one root endpoint /api/v1/cities inside one controller and implement 3 actions for create, list, and find nearest locations as follows

@RestController
@RequestMapping("/api/v1/cities")
@RequiredArgsConstructor
class CityController {

    private final CityRepository cityRepository;

    private final GeometryFactory geometryFactory;

    @PostMapping
    void create(@RequestBody CreateCityRequest request) {
        Point point = geometryFactory.createPoint(new Coordinate(request.getLng(), request.getLat()));
        City city = new City();
        city.setName(request.getName());
        city.setLocation(point);
        cityRepository.save(city);
    }

    @GetMapping
    List<CityDto> findAll() {
        List<CityDto> cities = new ArrayList<>();
        cityRepository.findAll().forEach(c -> {
            cities.add(new CityDto(c.getName(), c.getLocation().getY(), c.getLocation().getX()));
        });
        return cities;
    }

    @GetMapping("/nearest")
    List<CityDto> findNearestCities(@RequestParam("lat") float lat, @RequestParam("lng") float lng, @RequestParam("distance") int distance) {
        List<CityDto> cities = new ArrayList<>();
        Point point = geometryFactory.createPoint(new Coordinate(lng, lat));
        cityRepository.findNearestCities(point, distance).forEach(c -> {
            cities.add(new CityDto(c.getName(), c.getLocation().getY(), c.getLocation().getX()));
        });
        return cities;
    }
}

Line 8: GeometryFactory comes from the JTS library (pulled in by hibernate-spatial) and is used to create basic geometric shapes. In our case, we convert a latitude-longitude pair into a Point, which is then used in repository operations.
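Note that Spring Boot does not auto-configure a GeometryFactory bean, so injecting it as above requires defining one yourself. Here is a minimal sketch (the configuration class name is ours; GeometryFactory and PrecisionModel come from JTS):

@Configuration
class GeometryConfig {

    @Bean
    GeometryFactory geometryFactory() {
        // SRID 4326 matches the geography(Point, 4326) column on the entity
        return new GeometryFactory(new PrecisionModel(), 4326);
    }
}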

create method accepts a CreateCityRequest, which contains name, latitude, and longitude information as shown below.

@AllArgsConstructor
@NoArgsConstructor
@Data
class CreateCityRequest {

private String name;
private double lat;
private double lng;
}

findAll is used to list all available cities in the database.

findNearestCities is used for finding neighbouring cities within a given radius (in meters) of a given coordinate.
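The CityDto response object used by the controller is not shown above; a minimal version consistent with the constructor calls, new CityDto(name, lat, lng), would be:

@Data
@AllArgsConstructor
class CityDto {

    private String name;
    private double lat;
    private double lng;
}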

Now that the persistence layer and REST endpoints are ready, we can configure the application.

Application Configuration

This section contains application level configurations such as the application name, datasource, and jpa as shown below:

application.yaml
spring:
application:
name: spring-postgres-spatial
datasource:
url: <connection-string-from-rapidapp|or your own managed postgres url>
username: <username>
password: <password>
jpa:
database-platform: org.hibernate.dialect.PostgreSQLDialect
hibernate:
ddl-auto: update

Line 5: Connection URL for the PostgreSQL database. You can obtain this from Rapidapp or your own managed PostgreSQL service. It should have a format like jdbc:postgresql://<host>:<port>/<database>?sslmode=require.
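One more note: the geography column type requires the PostGIS extension. If you manage PostgreSQL yourself, enable it once per database before starting the application:

CREATE EXTENSION IF NOT EXISTS postgis;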

Running Application

You can run the application as follows.

./mvnw spring-boot:run

Demo

Create City

In this section, we will create the cities of Turkey.

curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Adana", "lat": "37.0000", "lng": "35.3213"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Adıyaman", "lat": "37.7648", "lng": "38.2786"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Afyonkarahisar", "lat": "38.7507", "lng": "30.5567"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Ağrı", "lat": "39.7191", "lng": "43.0503"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Amasya", "lat": "40.6499", "lng": "35.8353"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Ankara", "lat": "39.9208", "lng": "32.8541"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Antalya", "lat": "36.8841", "lng": "30.7056"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Artvin", "lat": "41.1828", "lng": "41.8183"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Aydın", "lat": "37.8560", "lng": "27.8416"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Balıkesir", "lat": "39.6484", "lng": "27.8826"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Bilecik", "lat": "40.0567", "lng": "30.0665"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Bingöl", "lat": "39.0626", "lng": "40.7696"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Bitlis", "lat": "38.3938", "lng": "42.1232"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Bolu", "lat": "40.5760", "lng": "31.5788"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Burdur", "lat": "37.4613", "lng": "30.0665"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Bursa", "lat": "40.2669", "lng": "29.0634"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Çanakkale", "lat": "40.1553", "lng": "26.4142"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Çankırı", "lat": "40.6013", "lng": "33.6134"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Çorum", "lat": "40.5506", "lng": "34.9556"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Denizli", "lat": "37.7765", "lng": "29.0864"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Diyarbakır", "lat": "37.9144", "lng": "40.2306"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Edirne", "lat": "41.6818", "lng": "26.5623"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Elâzığ", "lat": "38.6810", "lng": "39.2264"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Erzincan", "lat": "39.7500", "lng": "39.5000"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Erzurum", "lat": "39.9000", "lng": "41.2700"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Eskişehir", "lat": "39.7767", "lng": "30.5206"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Gaziantep", "lat": "37.0662", "lng": "37.3833"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Giresun", "lat": "40.9128", "lng": "38.3895"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Gümüşhane", "lat": "40.4386", "lng": "39.5086"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Hakkâri", "lat": "37.5833", "lng": "43.7333"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Hatay", "lat": "36.4018", "lng": "36.3498"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Isparta", "lat": "37.7648", "lng": "30.5566"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Mersin", "lat": "36.8000", "lng": "34.6333"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "İstanbul", "lat": "41.0053", "lng": "28.9770"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "İzmir", "lat": "38.4189", "lng": "27.1287"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Kars", "lat": "40.6167", "lng": "43.1000"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Kastamonu", "lat": "41.3887", "lng": "33.7827"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Kayseri", "lat": "38.7312", "lng": "35.4787"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Kırklareli", "lat": "41.7333", "lng": "27.2167"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Kırşehir", "lat": "39.1425", "lng": "34.1709"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Kocaeli", "lat": "40.8533", "lng": "29.8815"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Konya", "lat": "37.8667", "lng": "32.4833"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Kütahya", "lat": "39.4167", "lng": "29.9833"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Malatya", "lat": "38.3552", "lng": "38.3095"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Manisa", "lat": "38.6191", "lng": "27.4289"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Kahramanmaraş", "lat": "37.5858", "lng": "36.9371"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Mardin", "lat": "37.3212", "lng": "40.7245"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Muğla", "lat": "37.2153", "lng": "28.3636"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Muş", "lat": "38.9462", "lng": "41.7539"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Nevşehir", "lat": "38.6939", "lng": "34.6857"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Niğde", "lat": "37.9667", "lng": "34.6833"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Ordu", "lat": "40.9839", "lng": "37.8764"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Rize", "lat": "41.0201", "lng": "40.5234"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Sakarya", "lat": "40.6940", "lng": "30.4358"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Samsun", "lat": "41.2928", "lng": "36.3313"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Siirt", "lat": "37.9333", "lng": "41.9500"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Sinop", "lat": "42.0231", "lng": "35.1531"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Sivas", "lat": "39.7477", "lng": "37.0179"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Tekirdağ", "lat": "40.9833", "lng": "27.5167"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Tokat", "lat": "40.3167", "lng": "36.5500"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Trabzon", "lat": "41.0015", "lng": "39.7178"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Tunceli", "lat": "39.3074", "lng": "39.4388"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Şanlıurfa", "lat": "37.1591", "lng": "38.7969"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Uşak", "lat": "38.6823", "lng": "29.4082"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Van", "lat": "38.4891", "lng": "43.4089"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Yozgat", "lat": "39.8181", "lng": "34.8147"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Zonguldak", "lat": "41.4564", "lng": "31.7987"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Aksaray", "lat": "38.3687", "lng": "34.0370"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Bayburt", "lat": "40.2552", "lng": "40.2249"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Karaman", "lat": "37.1759", "lng": "33.2287"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Kırıkkale", "lat": "39.8468", "lng": "33.5153"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Batman", "lat": "37.8812", "lng": "41.1351"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Şırnak", "lat": "37.4187", "lng": "42.4918"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Bartın", "lat": "41.5811", "lng": "32.4610"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Ardahan", "lat": "41.1105", "lng": "42.7022"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Iğdır", "lat": "39.8880", "lng": "44.0048"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Yalova", "lat": "40.6500", "lng": "29.2667"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Karabük", "lat": "41.2061", "lng": "32.6204"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Kilis", "lat": "36.7184", "lng": "37.1212"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Osmaniye", "lat": "37.2130", "lng": "36.1763"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Düzce", "lat": "40.8438", "lng": "31.1565"}'

List Cities

curl -XGET http://localhost:8080/api/v1/cities

Find Nearest Cities

To find the cities within a 300 km radius of Ankara, you can use the following.

curl -XGET http://localhost:8080/api/v1/cities/nearest\?lat\=39.9208\&lng\=32.8541\&distance\=300000

Conclusion

In this article, we explored the power of geospatial data and how to effectively utilize it within a Spring Boot application using PostgreSQL with the PostGIS extension. We covered the fundamental concepts of geospatial data, the benefits of using PostGIS for geospatial operations, and real-world use cases such as navigation, environmental monitoring, urban planning, and location-based services.

tip

You can find the complete source code for this project on GitHub.

Create and Deploy Spring Boot Todo App to Google Cloud Run

· 5 min read
Huseyin BABAL
Software Developer

Introduction

In the rapidly evolving world of software development, deploying applications in a scalable and efficient manner is critical. With the rise of cloud computing, services like Google Cloud Run have become essential for developers looking to deploy containerized applications quickly and effortlessly. In this blog post, we'll walk through deploying a simple todo app built with Spring Boot and PostgreSQL to Google Cloud Run. We'll cover setting up the project, integrating PostgreSQL, and deploying to the cloud, ensuring your app is ready to handle varying loads efficiently.

Why Is Connection Pooling Essential for Serverless?

When deploying applications in a serverless environment like Google Cloud Run, managing database connections efficiently becomes crucial. Traditional connection management can lead to issues such as exhausting database connections, especially under load. This is where PgBouncer, a lightweight connection pooler for PostgreSQL, comes into play. It optimizes the usage of database connections, reducing latency and improving the performance of your serverless app. Additionally, it ensures that the application can handle sudden spikes in traffic without overwhelming the database.
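PgBouncer pools connections on the database side; it also helps to keep the application-side pool small, since many container instances may run in parallel. With Spring Boot's default HikariCP pool, that is a one-line setting in application.properties (the value below is illustrative, not a recommendation):

application.properties
spring.datasource.hikari.maximum-pool-size=5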

In this article, we will be using PostgreSQL as our database. You can maintain your database in any database management system. For a convenient deployment option, consider cloud-based solutions like Rapidapp, which offers managed PostgreSQL databases, simplifying setup and maintenance.

tip

Create a free database with connection pooling support for the serverless use-cases in Rapidapp in seconds here

Step-by-Step Guide to Creating the Todo App

Project Initialization and Dependencies

We will be using Spring Boot and PostgreSQL to build a todo application. You can initialize a Spring Boot project by using the Spring Boot CLI. Once installed, you can use the following command to initialize a project with the required dependencies.

spring init \
--dependencies=web,data-jpa,postgresql \
--type=maven-project \
--javaVersion=21 \
cloud-run-todo

Line 2: web for implementing REST endpoints, data-jpa for database persistence, and postgresql for PostgreSQL driver.

Line 3: --type=maven-project for creating a Maven project.

Line 4: --javaVersion=21 we will use Java 21 in Google Cloud Run environment.

Now that we have initialized the project, go to the folder cloud-run-todo and open it with your favourite IDE.

Implementing Entity and Repository

We have only one entity here, Todo, which will be used to store our todo items. Let's create a new entity called Todo as follows.

@Entity
class Todo {
@Id
@GeneratedValue(strategy = GenerationType.AUTO)
private Integer id;
private String description;
private Boolean completed;

public Todo(String description, Boolean completed) {
this.description = description;
this.completed = completed;
}

public Todo() {

}

public Integer getId() {
return id;
}

public void setId(Integer id) {
this.id = id;
}

public String getDescription() {
return description;
}

public void setDescription(String description) {
this.description = description;
}

public Boolean getCompleted() {
return completed;
}

public void setCompleted(Boolean completed) {
this.completed = completed;
}
}

In order to manage the Todo entity in the database, we will use the following repository interface.

interface TodoRepository extends CrudRepository<Todo, Integer>{}

TodoRepository will be used to perform CRUD operations on the Todo entity.

Implementing Rest Endpoints

Since we have only one entity, we will have one root endpoint /api/v1/todos inside a single controller and implement two actions, for creating and listing todo entities, as follows.

@RestController
@RequestMapping("/api/v1/todos")
class TodoController {

private final TodoRepository todoRepository;

TodoController(TodoRepository todoRepository) {
this.todoRepository = todoRepository;
}

@PostMapping
void create(@RequestBody CreateTodoRequest request) {
this.todoRepository.save(new Todo(request.getDescription(), false));
}

@GetMapping
Iterable<Todo> list() {
return this.todoRepository.findAll();
}
}

create method accepts a request CreateTodoRequest as shown below.

class CreateTodoRequest {
private String description;

public CreateTodoRequest(String description) {
this.description = description;
}

public CreateTodoRequest() {
}

public String getDescription() {
return description;
}

public void setDescription(String description) {
this.description = description;
}
}

Now that the persistence layer and REST endpoints are ready, we can configure the application.

Application Configuration

In a serverless environment, it is best practice to read the port from the PORT environment variable, since it is managed by the serverless provider. We can add the following configuration to application.properties.

application.properties
server.port=${PORT:8080}

By doing this, if there is a PORT environment variable, it will take precedence over the default value of 8080. In order to create tables out of entities automatically, we can use the following config.

application.properties
spring.jpa.hibernate.ddl-auto=update

As a final step, we need to create a file called project.toml in the root of the project to tell Cloud Run to use Java 21.

project.toml
[[build.env]]
name = "GOOGLE_RUNTIME_VERSION"
value = "21"

Deploying to Google Cloud Run

We will be using the gcloud CLI to deploy our application to Google Cloud Run. Before running the deployment command, you need to prepare the datasource URL, username, and password for PostgreSQL to pass as environment variables to the application. Use the following command to deploy.

gcloud run deploy \
--source . \
--update-env-vars SPRING_DATASOURCE_URL=jdbc:postgresql://<host>:<port>/<db>,SPRING_DATASOURCE_USERNAME=<user>,SPRING_DATASOURCE_PASSWORD=<password>

If you are using Rapidapp as your managed database, do not forget to use the Pooling Port as the port value, so that your database uses connection pooling to handle highly concurrent requests.

It will prompt for the service name; you can press Enter to accept the default. It will also prompt for the region; select the number of the desired region. If all goes well, it will deploy your application and print the service URL.
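If you need the service URL again later, you can list your deployed Cloud Run services:

gcloud run services list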

Demo

Create Todo

curl -XPOST -H "Content-Type: application/json" https://<your>.a.run.app/api/v1/todos -d '{"description": "buy milk"}'

List Todos

curl -XGET https://<your>.a.run.app/api/v1/todos

Conclusion

Deploying a Spring Boot application to Google Cloud Run is straightforward and efficient, allowing developers to leverage the power of serverless computing. By integrating PostgreSQL with connection pooling using PgBouncer and considering services like Rapidapp, you can ensure your application is robust and scalable. With this guide, you're now equipped to deploy your todo app to the cloud, ready to handle real-world workloads with ease.

tip

You can find the complete source code for this project on GitHub.

Automating Image Metadata Extraction with AWS Lambda, Go, and PostgreSQL

· 9 min read
Huseyin BABAL
Software Developer

Introduction

In today's digital age, images play a crucial role in various applications and services. However, managing and extracting metadata from these images can be a challenging task, especially when dealing with large volumes of data. In this article, we'll explore how to leverage AWS Lambda, Go, and PostgreSQL to create an automated system for extracting EXIF data from images and storing it in a database.

What is AWS Lambda?

AWS Lambda is a serverless compute service that lets you run code without provisioning or managing servers. It automatically scales your applications in response to incoming requests, making it an ideal solution for event-driven architectures. With Lambda, you only pay for the compute time you consume, making it cost-effective for various use cases.

Use-cases

AWS Lambda can be employed in numerous scenarios, including:

  • Real-time file processing
  • Data transformations
  • Automated backups
  • Scheduled tasks
  • Webhooks and API backends

In our case, we'll use Lambda to process images as they're uploaded to an S3 bucket, extract their EXIF data, and store it in a PostgreSQL database.

In this article, we will be using PostgreSQL as our database. You can maintain your database in any database management system. For a convenient deployment option, consider cloud-based solutions like Rapidapp, which offers managed PostgreSQL databases, simplifying setup and maintenance.

tip

Create a free database with connection pooling support for the serverless use-cases in Rapidapp in seconds here

Implementation

Project Initialization and Dependencies

In this project, we will implement a Go function that depends on AWS Lambda and PostgreSQL. You can initialize the Go project and install the dependencies as follows. (The EXIF library is not named elsewhere in the setup; the goexif package below matches the exif.Decode and exif.Get calls used later.)

mkdir aws-lambda-go
cd aws-lambda-go
go mod init aws-lambda-go
go get -u github.com/aws/aws-lambda-go/lambda
go get -u github.com/aws/aws-sdk-go-v2/config
go get -u github.com/aws/aws-sdk-go-v2/service/s3
go get -u github.com/lib/pq
go get -u github.com/rwcarlsen/goexif/exif

Function Endpoint

main.go
package main
...
import "github.com/aws/aws-lambda-go/lambda"
...
func HandleRequest(ctx context.Context, event events.S3Event) (*string, error) {
// Function logic goes here
}

func main() {
lambda.Start(HandleRequest)
}

Line 5: As always, context is used to control the execution logic, and since this function is triggered by an S3 event, we use the events.S3Event type. This means that once the function starts running, we receive a payload containing the S3 event that triggered it.

Line 10: In this part, the actual function logic is handled by a wrapper, lambda.Start, coming from the aws-lambda-go package.

Let's dive deep into the actual function logic.

Database Connection

We will read the database connection URL from the environment variables and then connect to the database. It is also a good idea to ping the database to be sure the connection is healthy.

main.go
connStr := os.Getenv("DB_URL")
db, err := sql.Open("postgres", connStr)
if err != nil {
return nil, fmt.Errorf("failed to open database: %s", err)
}
defer db.Close()

err = db.Ping()
if err != nil {
return nil, fmt.Errorf("failed to ping database: %s", err)
}
fmt.Println("Successfully connected to the database!")

Retrieving Object from S3

Once the function is triggered by an S3 event, we get the object from the S3 bucket as follows.

main.go
sdkConfig, err := config.LoadDefaultConfig(ctx)
if err != nil {
return nil, fmt.Errorf("failed to load SDK config: %s", err)
}
s3Client := s3.NewFromConfig(sdkConfig)

var bucket string
var key string
for _, record := range event.Records {
bucket = record.S3.Bucket.Name
key = record.S3.Object.URLDecodedKey

// Get the object
getObjectOutput, err := s3Client.GetObject(ctx, &s3.GetObjectInput{
Bucket: &bucket,
Key: &key,
})
if err != nil {
return nil, fmt.Errorf("failed to get object %s/%s: %s", bucket, key, err)
}
defer getObjectOutput.Body.Close()
...
}

Line 1: If you have ever used AWS SDKs before, you might have seen the credential chain. The AWS SDK can use different methods to resolve credentials when creating a session to connect to AWS services. If you don't pass any credentials explicitly, it tries to find them in the environment variables. If it cannot find them, it falls back to the AWS metadata to determine the identity. In the AWS Lambda environment, the Go SDK knows how to resolve the identity and construct a session.

Line 14: In this part, we get the object from the S3 bucket. We will use this object to decode the image and extract its EXIF information.

Extracting EXIF Data

main.go
buf := new(bytes.Buffer)
_, err = buf.ReadFrom(getObjectOutput.Body)
if err != nil {
return nil, fmt.Errorf("failed to read object %s/%s: %s", bucket, key, err)
}

// Check EXIF data
exifData, err := exif.Decode(buf)
if err != nil {
return nil, fmt.Errorf("failed to decode EXIF data: %s", err)
}

log.Printf("successfully retrieved %s/%s with EXIF DateTime: %v", bucket, key, exifData)

Line 2: Create a reader from S3 object contents to use for decoding EXIF data.

Line 8: Extract EXIF data from image

Store in Postgres Database

There is a lot of information in image headers, but in our case we will use two fields: make and model.

main.go
// SQL statement
sqlStatement := `INSERT INTO images (bucket, key, model, company) VALUES ($1,$2,$3,$4)`

// Execute the insertion
model, err := exifData.Get(exif.Model)
if err != nil {
return nil, fmt.Errorf("failed to get model: %s", err)
}
company, err := exifData.Get(exif.Make)
if err != nil {
return nil, fmt.Errorf("failed to get company: %s", err)
}
_, err = db.Exec(sqlStatement, bucket, key, model.String(), company.String())
if err != nil {
return nil, fmt.Errorf("failed to execute SQL statement: %s", err)
}

We basically read the EXIF data and insert it into the database. You can use the following to create the images table in your database.

CREATE TABLE images (
bucket varchar(255),
key varchar(255),
model varchar(255),
company varchar(255)
);

  • bucket - S3 bucket name
  • key - S3 object key
  • model - Model name of the camera used to take the image
  • company - Company name of the camera used to take the image

Now that we implemented our image metadata extraction, let's take a look at how we can deploy this function to AWS Lambda.

Deployment

Preparing Artifact

There is a reason our function has a main function: we are about to build an executable to serve as the bootstrap entrypoint in the AWS Lambda environment. We need to build the executable, zip it, and upload it to AWS Lambda as a new function.

GOOS=linux GOARCH=arm64 go build -tags lambda.norpc -o bootstrap main.go

We build an executable for the Linux OS and ARM64 architecture, using main.go as the entrypoint. The lambda.norpc build tag excludes the RPC library from the executable; it is only relevant if you were previously using the go1.x runtime. Also, we named the executable bootstrap, which is the entrypoint name AWS Lambda expects; the function will not run if you use another name. Finally, we zip the executable and upload it to AWS Lambda as a new function.

zip PhotoHandler.zip bootstrap

AWS Requirements

Once we deploy the function, it will require a set of permissions:

  • Accessing S3 buckets
  • Creating log groups in CloudWatch
  • Writing to CloudWatch logs

We can create an AWS role with the following policy for this purpose and assign it to the Lambda function.
trust-policy.json
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": "logs:CreateLogGroup",
"Resource": "arn:aws:logs:<region>:<account-id>:*"
},
{
"Effect": "Allow",
"Action": [
"logs:CreateLogStream",
"logs:PutLogEvents",
"lambda:InvokeFunction"
],
"Resource": [
"arn:aws:logs:<region>:<account-id>:log-group:/aws/lambda/PhotoHandler:*",
"arn:aws:lambda:<region>:<account-id>:function:PhotoHandler"
]
},
{
"Effect": "Allow",
"Action": "s3:GetObject",
"Resource": "*"
}
]
}

Line 7: This part is used to create the CloudWatch log group and log stream. Do not forget to use your own region and account ID. You can grab the account ID with the following command.

aws sts get-caller-identity

Line 17-18: This section contains another set of permissions for creating log events and for invoking a specific function, PhotoHandler in our case. Again, do not forget to replace the region and account ID with your own.

Line 23: This section contains the permission to access S3 buckets.

Now you can store this as trust-policy.json and execute the following command to create the role.

aws iam create-role \
--role-name photo-handler \
--assume-role-policy-document \
file://trust-policy.json

Remember this role name, since we will use it during AWS Lambda function creation.

AWS Lambda Function Creation

You can create a new lambda function as follows.

aws lambda create-function \
--function-name PhotoHandler \
--runtime provided.al2023 \
--handler bootstrap \
--architectures arm64 \
--role arn:aws:iam::<account-id>:role/photo-handler \
--zip-file fileb://PhotoHandler.zip

Line 3: provided.al2023 is an OS-only runtime; since we already have a binary executable, it can be handed to this environment as the entrypoint.

Line 6: Do not forget to replace the account ID; this part binds the role to this specific function. The function will then be able to perform the operations allowed by the policy we created the role from in the previous section.
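One thing the create-function call above does not cover: the handler reads the connection string from the DB_URL environment variable. You can attach it afterwards as follows (replace the placeholder with your actual connection string):

aws lambda update-function-configuration \
--function-name PhotoHandler \
--environment "Variables={DB_URL=<connection-string>}"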

Adding S3 Events Trigger

In this section, we will add a trigger for S3 events so that this Lambda function is invoked whenever you upload a new image to a specific S3 bucket.

s3-notification.json
{
"LambdaFunctionConfigurations": [
{
"LambdaFunctionArn": "arn:aws:lambda:<region>:<account-id>:function:PhotoHandler",
"Events": [
"s3:ObjectCreated:*"
],
"Filter": {
"Key": {
"FilterRules": [
{
"Name": "prefix",
"Value": "acme-images/"
},
{
"Name": "suffix",
"Value": ".jpeg"
}
]
}
}
}
]
}

Now you can configure your bucket for the notifications so that it will trigger this lambda function.

aws s3api put-bucket-notification-configuration \
--bucket acme-images \
--notification-configuration file://s3-notification.json

This configuration ensures that notifications about S3 object creation events trigger the AWS Lambda function. The event can then be consumed inside the HandleRequest function.
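Note that S3 also needs a resource-based permission to invoke the function. If the notification command above fails with a permissions error, granting it explicitly looks like this (a sketch; use your own account ID):

aws lambda add-permission \
--function-name PhotoHandler \
--statement-id s3-invoke \
--action lambda:InvokeFunction \
--principal s3.amazonaws.com \
--source-arn arn:aws:s3:::acme-images \
--source-account <account-id>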

Last Step

Now that we have added an S3 event trigger to the Lambda function, whenever you add a new JPEG file to the acme-images bucket, the function is invoked, extracts the EXIF data, and finally stores it in the PostgreSQL database.
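To try it end to end, upload a JPEG whose key matches the configured prefix and suffix, then check the CloudWatch logs and the images table:

aws s3 cp ./sample.jpeg s3://acme-images/acme-images/sample.jpeg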

Conclusion

In this article, we explored how to automate image metadata extraction using AWS Lambda, Go, and PostgreSQL. We demonstrated how to use AWS Lambda to handle S3 events, extract EXIF data from images using the exif package in Go, and store the extracted metadata in a PostgreSQL database using Rapidapp, PostgreSQL as a Service. There will be more serverless use-cases covered in the future, so do not forget to subscribe for new articles.

tip

You can find the complete source code for this project on GitHub.

Building Devops AI Assistant with Langchain, Ollama, and PostgreSQL

· 6 min read
Huseyin BABAL
Software Developer

Introduction

Vector databases emerge as a powerful tool for storing and searching high-dimensional data like document embeddings, offering lightning-fast similarity queries. This article delves into leveraging PostgreSQL, a popular relational database, as a vector database with the pgvector extension. We'll explore how to integrate it into a LangChain workflow for building a robust question-answering (QA) system.

What are Vector Databases?

Imagine a vast library holding countless documents. Traditional relational databases might classify them by subject or keyword. But what if you want to find documents most similar to a specific concept or question, even if keywords don't perfectly align? Vector databases excel in this scenario. They store data as numerical vectors in a high-dimensional space, where closeness in the space reflects semantic similarity. This enables efficient retrieval of similar documents based on their meaning, not just exact keyword matches.

PostgreSQL as Vector Database

PostgreSQL, a widely adopted and versatile relational database system, can be empowered with vector search capabilities using the pgvector extension. You can maintain your database in any database management system. For a convenient deployment option, consider cloud-based solutions like Rapidapp, which offers managed PostgreSQL databases, simplifying setup and maintenance.

tip

Create a free database in Rapidapp in seconds here

If you maintain your PostgreSQL database on your own, you can enable the pgvector extension by executing the following command in each database:

CREATE EXTENSION vector;

LangChain: Building Flexible AI Pipelines

LangChain is a powerful framework that facilitates the construction of modular AI pipelines. It allows you to chain together various AI components seamlessly, enabling the creation of complex and customizable workflows.

Our Use Case: Embedding Data for AI-powered QA

In our specific scenario, we aim to leverage vector search to enhance a question-answering system. Here's how the components fit together:

  • Data Preprocessing: Process your documents (e.g., web pages) using Natural Language Processing (NLP) techniques to extract relevant text content. Generate vector representations of your documents using an appropriate AI library (e.g., OllamaEmbeddings in our code).

  • Embedding Storage with pgvector: Store the document vectors and their corresponding metadata (e.g., titles, URLs) in your PostgreSQL database table using pgvector.

  • Building the LangChain Workflow: Construct a LangChain pipeline that incorporates the following elements:

    • Retriever: This component retrieves relevant documents from your PostgreSQL database using vector similarity search powered by pgvector. When a user poses a question, the retriever searches for documents with vector representations closest to the query's vector.
    • Question Passage Transformer: (Optional) This component can further process the retrieved documents to extract snippets most relevant to the user's query.
    • Language Model (LLM): This component uses the retrieved context (potentially augmented with question-specific passages) to formulate a comprehensive response to the user's question.

DevOps AI Assistant: Step-by-step Implementation

We will implement the application using Python and will use Poetry for dependency management.

Project Creation

Create a directory and initiate a project by running the following command:

poetry init

This will create a pyproject.toml file in the current directory.

Dependencies

You can install dependencies by running the following command:

poetry add langchain-cohere \
langchain-postgres \
langchain-community \
html2text \
tiktoken

Once you have installed the dependencies, you can create an empty main.py file to implement the business logic.
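The snippets below assume the following imports at the top of main.py. Exact module paths move between LangChain releases, so treat this as a sketch and adjust to your installed versions:

from langchain_community.document_loaders import AsyncHtmlLoader
from langchain_community.document_transformers import Html2TextTransformer
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.llms import Ollama
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_postgres import PGVector
from langchain_text_splitters import RecursiveCharacterTextSplitter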

Preparing the PostgreSQL Connection URL

Once you create your database on Rapidapp, or use your own database, you can construct the PostgreSQL connection URL as follows: postgresql+psycopg://<user>:<pass>@<host>:<port>/<db>

Defining the Vector Store

connection = "<connection_string>"
collection_name = "prometheus_docs"
embeddings = OllamaEmbeddings()

vectorstore = PGVector(
embeddings=embeddings,
collection_name=collection_name,
connection=connection,
use_jsonb=True,
)

As you can see, we use embeddings in the codebase. Your implementation can interact with different AI providers like OpenAI, HuggingFace, Ollama, etc.; embeddings provide a standard interface for all of them. In our case, we use OllamaEmbeddings, since we will be using Ollama as the AI provider.

Line 2: This is the collection name in the PostgreSQL database where we store vector documents. In our case, we will store a couple of Prometheus documentation pages to help the AI provider answer users' questions.

Line 5: LangChain has lots of vector store implementations, and PGVector is one of them. This lets us run vector similarity searches against the PostgreSQL database.

Indexing Documents

urls = ["https://prometheus.io/docs/prometheus/latest/getting_started/", "https://prometheus.io/docs/prometheus/latest/federation/"]
loader = AsyncHtmlLoader(urls)
docs = loader.load()

htmlToText = Html2TextTransformer()
docs_transformed = htmlToText.transform_documents(docs)

splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
chunk_size=1000, chunk_overlap=0
)
docs = splitter.split_documents(docs_transformed)
vectorstore.add_documents(docs)

Line 1-3: With the help of AsyncHtmlLoader, we simply load two documentation pages of Prometheus.

Line 5-6: Since we cannot use raw HTML files, we convert them to text using Html2TextTransformer.

Line 8-11: RecursiveCharacterTextSplitter helps by chunking large text documents into manageable pieces that comply with vector store limitations, improve embedding efficiency, and potentially enhance retrieval accuracy.

Line 12: Store processed documents into vector store.

Building the LangChain Workflow

retriever = vectorstore.as_retriever()
llm = Ollama()

message = """
Answer this question using the provided context only.

{question}

Context:
{context}
"""

prompt = ChatPromptTemplate.from_messages([("human", message)])

rag_chain = {"context": retriever, "question": RunnablePassthrough()} | prompt | llm
response = rag_chain.invoke("how to federate on prometheus")
print(response)

The above code snippet demonstrates how to use LangChain to retrieve information from a vector store and generate a response using a large language model (LLM) based on the retrieved information. Let's break it down step by step:

Line 1: This line exposes the vector store we set up earlier as a LangChain retriever. The retriever is responsible for fetching relevant documents based on a query.

Line 2: This line initializes an instance of the Ollama LLM, which will be used to generate the response to the question.

Line 4: The code defines a multi-line string variable named message. This string is a template with two placeholders: {question}, which holds the specific question you want answered, and {context}, which contains the relevant background information for the question.

Line 13: Generates chat prompt template.

Line 15: Here the question and context are piped into the template to generate the prompt, which is then passed to the LLM to generate the response. This composition forms a runnable chain.

Line 16: We invoke the chain with a question and get the response.

Conclusion

In this practical guide, we've delved into using PostgreSQL as a vector database, leveraging the pgvector extension. We explored how this approach can be used to build a context-aware AI assistant, focusing on Prometheus documentation as an example. By storing document embeddings alongside their metadata, we enabled the assistant to retrieve relevant information based on semantic similarity, going beyond simple keyword matching. LangChain played a crucial role in this process. Its modular framework allowed us to effortlessly connect various AI components, like PGVector for vector retrieval and OllamaEmbeddings for interacting with our chosen AI provider. Furthermore, LangChain's ability to incorporate context within user questions significantly enhances the relevance and accuracy of the assistant's responses.

tip

You can find the complete source code for this project on GitHub.