Building a Todo API with Rust - A Step-by-Step Guide Using Axum and Diesel

· 7 min read
Huseyin BABAL
Software Developer

Introduction

In the world of web development, performance and safety are paramount. Rust, with its emphasis on speed and memory safety, has emerged as a powerful language for building robust web applications. Today, we'll explore how to create a high-performance RESTful API for a Todo application using Rust, along with two of its most popular libraries: Axum for web services and Diesel for the ORM.

  • Rust: A systems programming language that runs blazingly fast and prevents segfaults.
  • Axum: A web application framework that focuses on ergonomics and modularity.
  • Diesel: A safe, extensible ORM and query builder for Rust.

Prerequisites

  • Rust
  • Cargo for package management
  • Diesel
  • PostgreSQL

In this article, we will be using PostgreSQL as our database. You can maintain your database in any database management system. For a convenient deployment option, consider cloud-based solutions like Rapidapp, which offers managed PostgreSQL databases, simplifying setup and maintenance.
tip

Create a free database in Rapidapp in seconds here

Getting Started

You can initialize the project and add required dependencies as follows;

# Initialize the project
cargo new todo-rs
cd todo-rs
# Add dependencies
cargo add \
axum \
tokio \
serde \
serde_json \
diesel \
dotenvy \
-F tokio/full,serde/derive,diesel/postgres,diesel/r2d2

cargo add is used to add dependencies; if you also want to enable features of a specific crate (package), you pass them with the -F parameter. For example, to include the postgres feature of diesel, the notation is diesel/postgres. The command above will populate the Cargo.toml file as follows;

[package]
name = "todo-rs"
version = "0.1.0"
edition = "2021"

[dependencies]
axum = "0.7.5"
axum-macros = "0.4.1"
diesel = { version = "2.2.2", features = ["postgres", "r2d2"] }
dotenvy = "0.15.7"
serde = { version = "1.0.204", features = ["derive"] }
serde_json = "1.0.122"
tokio = { version = "1.39.2", features = ["full"] }

DB Migration with Diesel
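
Diesel's migrations are driven by the standalone Diesel CLI, and both the CLI and dotenvy read the database connection string from the DATABASE_URL variable. A minimal setup sketch (the connection string below is a placeholder for your own database):

# Install the Diesel CLI with PostgreSQL support (skip if already installed)
cargo install diesel_cli --no-default-features --features postgres

# Diesel CLI and dotenvy both read DATABASE_URL; keeping it in a .env file
# in the project root is the usual approach (placeholder credentials below)
echo 'DATABASE_URL=postgres://user:password@localhost:5432/todo_rs' > .env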

You can initialize the migration for your project for the first time with the following;

diesel setup

This will create a migrations folder and a diesel.toml file in the project root. In this article, we will implement a Todo REST API, and the only business model we have is todos. In order to generate a migration for the todos entity, we can use the following command.

diesel migration generate create_todos_table

This will generate a dedicated migration folder for the todos table creation. When you open that folder, you will see up.sql and down.sql files. up.sql is executed when we run the migration to apply the DB changes; down.sql is executed when we revert them. We are responsible for filling in those SQL files, as you can see below.

up.sql
CREATE TABLE todos (
    id SERIAL PRIMARY KEY,
    title TEXT NOT NULL,
    content TEXT NOT NULL
);
down.sql
DROP TABLE todos;

Now we can run diesel migration run to apply the migrations. This will create the table and also generate a src/schema.rs file containing the mapped table definition for the todo entity as follows

src/schema.rs
// @generated automatically by Diesel CLI.

diesel::table! {
    todos (id) {
        id -> Int4,
        title -> Text,
        content -> Text,
    }
}

As you can clearly see, it is generated by Diesel and you shouldn't edit it manually.

Implement Axum Server

In this section we will implement the part responsible for running an HTTP server with Axum. This server will expose the handlers for the CRUD operations of the Todo entity.

src/main.rs
#[tokio::main]
async fn main() {
    dotenv().ok();
    let database_url = env::var("DATABASE_URL").expect("DATABASE_URL must be set");
    let manager = ConnectionManager::<PgConnection>::new(database_url);
    let pool = r2d2::Pool::builder()
        .max_size(5)
        .build(manager)
        .expect("Failed to create pool.");
    let db_connection = Arc::new(pool);

    let app = Router::new()
        .route("/todos", post(handlers::create_todo))
        .route("/todos", get(handlers::get_todos))
        .route("/todos/:id", get(handlers::get_todo))
        .route("/todos/:id", post(handlers::update_todo))
        .route("/todos/:id", delete(handlers::delete_todo))
        .with_state(db_connection.clone());

    let listener = tokio::net::TcpListener::bind("0.0.0.0:8080").await.unwrap();
    let server = axum::serve(listener, app).with_graceful_shutdown(shutdown_signal());

    tokio::spawn(async move {
        println!("Server is running");
    });

    if let Err(e) = server.await {
        eprintln!("Server error: {}", e);
    }
}

Line 4-10: Set up the database connection pool

Line 12-18: Define the routes for our API

Line 20-21: Set up the server address

Line 23-30: Log application startup or failure

src/main.rs is the file where we mostly do our global initializations, such as setting up the database connection pool and wiring the REST endpoints. Now that we have endpoints for the Todo entity, let's implement the real logic of those handlers.
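
The with_graceful_shutdown call above references a shutdown_signal helper that the listing doesn't show; a minimal version, assuming Ctrl+C is the only shutdown source you care about, could look like this:

async fn shutdown_signal() {
    // Resolve when the process receives Ctrl+C, letting axum::serve finish in-flight requests
    tokio::signal::ctrl_c()
        .await
        .expect("failed to install Ctrl+C handler");
}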

Implementing Handlers

Create Todo Handler

In this handler, we accept a NewTodo request and create a new record in the database. In Axum handlers, you can see a State extractor beside the request body; it is used for passing dependencies, such as the database connection pool, into the handler for DB operations.
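
The handlers below also rely on a Todo struct, NewTodo and UpdateTodo payloads, and a DbPool state type that the article doesn't list explicitly. A minimal sketch of how they might be declared (the file name src/models.rs and the optional fields on UpdateTodo are assumptions; field names follow the todos migration above):

src/models.rs
use std::sync::Arc;

use diesel::prelude::*;
use diesel::r2d2::{self, ConnectionManager};
use serde::{Deserialize, Serialize};

use crate::schema::todos;

// Connection pool handle shared with the handlers through Axum state
pub type DbPool = Arc<r2d2::Pool<ConnectionManager<PgConnection>>>;

// Row read back from the todos table
#[derive(Queryable, Serialize)]
pub struct Todo {
    pub id: i32,
    pub title: String,
    pub content: String,
}

// Payload for POST /todos, insertable into the todos table
#[derive(Insertable, Deserialize)]
#[diesel(table_name = todos)]
pub struct NewTodo {
    pub title: String,
    pub content: String,
}

// Payload for updates; optional fields so either column can be changed on its own
#[derive(AsChangeset, Deserialize)]
#[diesel(table_name = todos)]
pub struct UpdateTodo {
    pub title: Option<String>,
    pub content: Option<String>,
}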

src/handlers.rs
pub async fn create_todo(
    State(db): State<DbPool>,
    Json(new_todo): Json<NewTodo>,
) -> (StatusCode, Json<Todo>) {
    let mut conn = db.get().map_err(|_| StatusCode::INTERNAL_SERVER_ERROR).unwrap();

    let todo = diesel::insert_into(todos::table)
        .values(&new_todo)
        .get_result(&mut conn)
        .map_err(|_| StatusCode::INTERNAL_SERVER_ERROR).unwrap();

    (StatusCode::CREATED, Json(todo))
}

Line 2: Accept db connection pool as dependency

Line 3: Request body as NewTodo

Line 5: Get available connection from DB connection pool, throw error otherwise.

Line 7: Insert new_todo in todos table

Line 12: Return CREATED status code and new todo item as response body

List Todos Handler

src/handlers.rs
pub async fn get_todos(
    State(db): State<DbPool>,
) -> (StatusCode, Json<Vec<Todo>>) {
    let mut conn = db.get().map_err(|_| StatusCode::INTERNAL_SERVER_ERROR).unwrap();

    let results = todos::table.load::<Todo>(&mut conn)
        .map_err(|_| StatusCode::INTERNAL_SERVER_ERROR).unwrap();

    (StatusCode::OK, Json(results))
}

This time, we don't expect anything in the request body; we just return the todo items by using the load function, which maps the rows to the Todo struct. As always, we return the results in the response body with status code OK.

Get Todo Handler

We get the todo id from the path params and query the todos table by filtering on id as follows

src/handlers.rs
pub async fn get_todo(
    Path(todo_id): Path<i32>,
    State(db): State<DbPool>,
) -> (StatusCode, Json<Todo>) {
    let mut conn = db.get().map_err(|_| StatusCode::INTERNAL_SERVER_ERROR).unwrap();

    let result = todos::table.filter(id.eq(todo_id)).first::<Todo>(&mut conn)
        .map_err(|_| StatusCode::INTERNAL_SERVER_ERROR).unwrap();

    (StatusCode::OK, Json(result))
}

Update Todo Handler

In this handler, we accept the update payload from the end user and update the existing Todo by resolving the id from the path params.

src/handlers.rs
pub async fn update_todo(
    Path(todo_id): Path<i32>,
    State(db): State<DbPool>,
    Json(update_todo): Json<UpdateTodo>,
) -> (StatusCode, Json<Todo>) {
    let mut conn = db.get().map_err(|_| StatusCode::INTERNAL_SERVER_ERROR).unwrap();

    let todo = diesel::update(todos::table.filter(id.eq(todo_id)))
        .set(&update_todo)
        .get_result(&mut conn)
        .map_err(|_| StatusCode::INTERNAL_SERVER_ERROR).unwrap();

    (StatusCode::OK, Json(todo))
}

Delete Todo Handler

As you might guess, we resolve the todo id from the path params, then execute a delete query against the todos table as follows.

src/handlers.rs
pub async fn delete_todo(
    Path(todo_id): Path<i32>,
    State(db): State<DbPool>,
) -> StatusCode {
    let mut conn = db.get().map_err(|_| StatusCode::INTERNAL_SERVER_ERROR).unwrap();

    let _ = diesel::delete(todos::table.filter(id.eq(todo_id)))
        .execute(&mut conn)
        .map_err(|_| StatusCode::INTERNAL_SERVER_ERROR).unwrap();

    StatusCode::NO_CONTENT
}

Demo Time

Once you have set the DATABASE_URL environment variable, you can run the application as follows;

cargo run

Here are some Todo operations

Create a todo

curl -X POST -H "Content-Type: application/json" -d '{"title":"Buy groceries","content":"banana,milk"}' http://localhost:8080/todos

List all todos

curl http://localhost:8080/todos

Get a specific todo

curl http://localhost:8080/todos/1

Update a todo

curl -X POST -H "Content-Type: application/json" -d '{"title":"Buy Groceries", "content": "banana"}' http://localhost:8080/todos/1

Delete a todo

curl -X DELETE http://localhost:8080/todos/1

Conclusion

We've successfully built a Todo API using Rust, Axum, and Diesel. This combination provides a robust, safe, and efficient backend for web applications. The strong typing of Rust, combined with Diesel's compile-time checked queries and Axum's ergonomic routing, creates a powerful foundation for building scalable web services. By leveraging Rust's performance and safety features, we can create APIs that are not only fast but also resistant to common runtime errors. As you continue to explore Rust for web development, you'll find that this stack provides an excellent balance of developer productivity and application performance. Remember, this is just the beginning. You can extend this API with authentication, more complex queries, and additional features to suit your specific needs. Happy coding!

tip

You can find the complete source code for this project on GitHub.

Streaming PostgreSQL Changes to Kafka with Debezium

· 8 min read
Huseyin BABAL
Software Developer

Introduction: Why Send Changes to Kafka

In modern distributed systems, keeping multiple services in sync and maintaining data consistency across microservices can be challenging. When dealing with a microservices architecture, it's crucial to have an efficient way to propagate database changes to other services in real time. One effective solution is to publish database changes to a message broker like Apache Kafka. Kafka acts as an intermediary that allows various services to subscribe to these changes and react accordingly. This approach ensures real-time data synchronization, reduces the complexity of direct service-to-service communication, and enhances the overall scalability and fault tolerance of the system.

Use Cases for Publishing Database Changes to Kafka

  • Real-Time Analytics: Feeding database changes to a real-time analytics system to provide up-to-the-minute insights.
  • Event-Driven Architecture: Enabling services to react to database changes, triggering workflows or business processes.
  • Cache Invalidation: Automatically invalidating or updating cache entries based on database changes to ensure consistency.
  • Data Replication: Replicating data across different data stores or geographic regions for redundancy and high availability.
  • Audit Logging: Keeping a comprehensive audit log of all changes made to database for compliance and debugging purposes.

What is Debezium?

Debezium is an open-source distributed platform that captures database changes and streams them to Kafka in real-time. It leverages the database's transaction log to detect changes and publish them as events in Kafka topics. Debezium supports various databases, including PostgreSQL, MySQL, and MongoDB, making it a versatile choice for change data capture (CDC) needs.

PostgreSQL Configuration: Logical WAL Replication

In this article, we will be using PostgreSQL as our database with logical WAL replication enabled. You can maintain your database in any database management system. For a convenient deployment option, consider cloud-based solutions like Rapidapp, which offers managed PostgreSQL databases with built-in logical WAL replication, simplifying setup and maintenance.

tip

Create a free database with built-in logical WAL replication in Rapidapp in seconds here

If you choose to maintain your own PostgreSQL database, you can enable logical WAL replication with the following PostgreSQL configuration.

postgresql.conf
...
wal_level = logical
...

You can see more details about WAL Level in PostgreSQL Documentation.
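
You can verify the active setting from any SQL client; note that changing wal_level requires a PostgreSQL restart.

SHOW wal_level;
-- expected output once logical replication is enabled: logical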

Deploying Debezium Connect with PostgreSQL Connection

There are several ways to deploy Debezium Connect, but we will use Docker to spin up a container running Debezium Connect as follows.

docker run --rm --name debezium \
-e BOOTSTRAP_SERVERS=<bootstrap_servers> \
-e GROUP_ID=1 \
-e CONFIG_STORAGE_TOPIC=connect_configs \
-e OFFSET_STORAGE_TOPIC=connect_offsets \
-e STATUS_STORAGE_TOPIC=connect_statuses \
-e ENABLE_DEBEZIUM_SCRIPTING='true' \
-e CONNECT_SASL_MECHANISM=SCRAM-SHA-256 \
-e CONNECT_SECURITY_PROTOCOL=SASL_SSL \
-e CONNECT_SASL_JAAS_CONFIG='org.apache.kafka.common.security.scram.ScramLoginModule required username="<username>" password="<password>";' \
-p 8083:8083 debezium/connect:2.7

BOOTSTRAP_SERVERS: You can set bootstrap server for this env variable. You can find this on Upstash dashboard if you are using their managed Kafka.

CONNECT_SASL_JAAS_CONFIG: This part contains security module and username/password pair. You don't need to set this if you are not using Kafka with authentication. However, if you are using Kafka from Upstash, then you can find username and password values on Kafka cluster details page.

CONFIG_STORAGE_TOPIC: This environment variable is used to specify the Kafka topic where Debezium will store the connector properties.

OFFSET_STORAGE_TOPIC: This environment variable is used to specify the Kafka topic where Debezium will store the connector offsets.

STATUS_STORAGE_TOPIC: This environment variable is used to specify the Kafka topic where Debezium will store the connector statuses.

Debezium Connect is now running, but it is empty: no source (PostgreSQL in our case) is being tracked yet, and no data is being sent to the sink, which is Kafka in our case.

We will also leverage two SaaS solutions:

  • Rapidapp for PostgreSQL: To quickly set up and manage our PostgreSQL database.
tip

Create a free database in Rapidapp Starter in seconds here

  • Upstash Kafka: A managed Kafka service we will use as the message broker for the change events.
tip

Create a free Kafka cluster in Upstash here

Adding Debezium Connector

You can add a new connector to Debezium Connect by using its REST API as follows.

curl --location 'http://localhost:8083/connectors' \
--header 'Accept: application/json' \
--header 'Content-Type: application/json' \
--data '{
  "name": "postgres-connector",
  "config": {
    "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
    "database.hostname": "<pg_host>",
    "database.port": "<pg_port>",
    "database.user": "<pg_user>",
    "database.password": "<pg_pass>",
    "database.dbname": "<pg_db>",
    "database.server.id": "<unique_id>",
    "table.include.list": "<schema.table_name>",
    "topic.prefix": "<pg_topic>",
    "plugin.name": "pgoutput",
    "kafka.bootstrap.servers": "<kafka_host>:<kafka_port>",
    "kafka.topic.prefix": "<kafka_topic_prefix>"
  }
}'

Line 7: This is needed to tell Debezium how to connect to the source.

Line 8-12: PostgreSQL connection properties; if you are using Rapidapp, you can grab these details on the Connection Properties tab of the database details page.

Line 13: This is the unique database server id, which is used by Debezium to differentiate the sources.

Line 14: This is the list of tables that will be monitored by Debezium.

Line 16: This field tells Debezium which logical decoding plugin to use for this connector to serialize/deserialize data coming from the PostgreSQL WAL.

Once the connector is created, you can verify it by listing available connectors with the following;

curl -XGET http://localhost:8083/connectors

Step-by-Step Spring Boot Application Setup

In this section, we will implement a simple Spring Boot CRUD application where every modification in the PostgreSQL database is synchronized to Kafka automatically. This is especially useful when some other service is interested in those changes. In our case, we will be maintaining Product information in the PostgreSQL database. Let's get started!

Project Initialization and Dependencies

We will be using Spring Boot and PostgreSQL to build the application. You can initialize a Spring Boot project by using the Spring Boot CLI. Once installed, you can use the following command to initialize a project with the required dependencies.

spring init \
--dependencies=web,data-jpa,postgresql,lombok \
--type=maven-project \
--javaVersion=21 \
spring-pg-debezium

Line 2: web for implementing REST endpoints, data-jpa for database persistence, and postgresql for PostgreSQL driver.

Line 3: --type=maven-project for creating a Maven project.

Line 4: --javaVersion=21 sets Java 21 as the project's Java version.

Implementing Entity and Repository

We have only one entity here, Product, which will be used to store product information. Let's create a new entity called Product as follows.

@Entity
@Data
@NoArgsConstructor
@AllArgsConstructor
class Product {

    @Id
    @GeneratedValue
    private Long id;

    private String title;

    @Column(name = "price", precision = 10, scale = 2)
    private BigDecimal price;
}

Line 2: Automatically enable getter/setter methods by using Lombok

Line 3: Generate no-arg constructor

Line 4: Generate constructor with all instance variables

Line 13: Define the price column to accept values with at most 10 digits in total, 2 of them after the decimal point, e.g. 12345678.99

In order to manage the Product entity in the database, we will use the following repository interface.

interface ProductRepository extends CrudRepository<Product, Long> {}

Implementing Rest Endpoints

We have one root endpoint /api/v1/products inside one controller and implement 3 actions for create, update, and delete as follows

@RestController
@RequestMapping("/api/v1/products")
@RequiredArgsConstructor
class ProductController {

    private final ProductRepository productRepository;

    @PostMapping
    void create(@RequestBody CreateProductRequest request) {
        Product product = new Product();
        product.setTitle(request.getTitle());
        product.setPrice(request.getPrice());
        productRepository.save(product);
    }

    @PatchMapping("/{id}")
    void update(@RequestBody UpdateProductRequest request, @PathVariable("id") Long id) {
        Product p = productRepository.findById(id).orElseThrow(() -> new EntityNotFoundException("Product not found"));
        p.setPrice(request.getPrice());
        productRepository.save(p);
    }

    @DeleteMapping("/{id}")
    void delete(@PathVariable("id") Long id) {
        productRepository.deleteById(id);
    }
}

The create method accepts a CreateProductRequest, which contains the title and price information as shown below.

@Data
@NoArgsConstructor
@AllArgsConstructor
class CreateProductRequest {

    private String title;

    private BigDecimal price;

}

update is used to update product price, and it accepts a request as follows.

@Data
@NoArgsConstructor
@AllArgsConstructor
class UpdateProductRequest {

    private BigDecimal price;

}

Now that the persistence layer and REST endpoints are ready, we can configure the application.

Application Configuration

This section contains application level configurations such as the application name, datasource, and jpa as shown below:

application.yaml
spring:
  application:
    name: spring-pg-debezium
  datasource:
    url: <connection-string-from-rapidapp|or your own managed postgres url>
    username: <username>
    password: <password>
  jpa:
    database-platform: org.hibernate.dialect.PostgreSQLDialect
    hibernate:
      ddl-auto: update

Line 5: Connection URL for the PostgreSQL database. You can obtain this from Rapidapp or your own managed PostgreSQL service. It should have a format like jdbc:postgresql://<host>:<port>/<database>?sslmode=require.

Running Application

You can run the application as follows

./mvnw spring-boot:run

Demo

Once you perform any of the following requests, the change will be published to the Kafka cluster, where you can consume and inspect the message.
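
Debezium publishes each captured table to a topic named <topic.prefix>.<schema>.<table>, so with the connector config above and the default public schema, the Product table would land in a topic like <pg_topic>.public.product. A sketch of watching it with the standard console consumer (the broker address, authentication options, and exact topic name depend on your setup):

kafka-console-consumer.sh \
--bootstrap-server <kafka_host>:<kafka_port> \
--topic <pg_topic>.public.product \
--from-beginning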

Create Product

curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/products -d '{"title": "Blue Iphone", "price": "37.3213"}'

Update Product

curl -XPATCH -H "Content-Type: application/json" http://localhost:8080/api/v1/products/1 -d '{"price": "37.1213"}'

Delete Product

curl -XDELETE  http://localhost:8080/api/v1/products/1
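
For reference, each message Debezium publishes wraps the row in an envelope with before/after images and an op code (c = create, u = update, d = delete). A trimmed sketch of what the create request above might produce (field values and exact formatting are illustrative):

{
  "before": null,
  "after": { "id": 1, "title": "Blue Iphone", "price": "37.32" },
  "source": { "connector": "postgresql", "table": "product" },
  "op": "c",
  "ts_ms": 1721900000000
}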

Conclusion

Integrating Debezium with PostgreSQL and Kafka in a Spring Boot environment allows you to efficiently stream database changes to various services. This setup not only enhances data consistency and real-time processing capabilities but also simplifies the architecture of your microservices. By following this guide, you can leverage the power of change data capture to build responsive and scalable applications.

tip

You can find the complete source code for this project on GitHub.

Building Location Based Search Service with Spring Boot PostgreSQL and PostGIS

· 12 min read
Huseyin BABAL
Software Developer

Introduction to Geospatial Data

Geospatial data, also known as spatial data, represents the physical location and shape of objects on the Earth's surface. It includes information such as latitude, longitude, altitude, and the spatial relationships between different objects. Geospatial data is used in a wide range of applications, from mapping and navigation to environmental monitoring and urban planning.

Use Cases for Geospatial Data

Geospatial data has numerous applications across various industries. Some common use cases include:

  • Navigation and Routing: GPS systems use geospatial data to provide real-time navigation and routing information.
  • Environmental Monitoring: Track changes in land use, deforestation, and urban sprawl using satellite imagery and geospatial analysis.
  • Urban Planning: Plan infrastructure projects, analyze traffic patterns, and manage public services using geospatial data.
  • Location-Based Services: Deliver personalized content, offers, and services based on a user's location.

Using Geospatial Data in PostgreSQL with PostGIS Extension

In this article, we will be using PostgreSQL as our database with PostGIS extension. You can maintain your database in any database management system. For a convenient deployment option, consider cloud-based solutions like Rapidapp, which offers managed PostgreSQL databases with built-in postgis extension, simplifying setup and maintenance.

tip

Create a free database with built-in postgis extension in Rapidapp in seconds here

If you choose to maintain your own PostgreSQL database, you can enable the PostGIS extension with the following command for each database;

CREATE EXTENSION postgis;

Step-by-Step Guide to Creating the Location-Based Search Service

One practical application of geospatial data is a geolocation search application, where users can find nearby points of interest within a specified radius. In this article, we will build a Spring Boot application that searches for cities within a specified radius of a given point.

Project Initialization and Dependencies

We will be using Spring Boot and PostgreSQL to build the application. You can initialize a Spring Boot project by using the Spring Boot CLI. Once installed, you can use the following command to initialize a project with the required dependencies.

spring init \
--dependencies=web,data-jpa,postgresql,lombok \
--type=maven-project \
--javaVersion=21 \
spring-postgres-spatial

Line 2: web for implementing REST endpoints, data-jpa for database persistence, and postgresql for PostgreSQL driver.

Line 3: --type=maven-project for creating a Maven project.

Line 4: --javaVersion=21 sets Java 21 as the project's Java version.

There is one more dependency we need to add to enable the spatial features of Hibernate: hibernate-spatial. Open pom.xml and add the following dependency to the dependencies section.

<dependency>
    <groupId>org.hibernate</groupId>
    <artifactId>hibernate-spatial</artifactId>
    <version>6.5.2.Final</version>
</dependency>

Now that we initialized the project, go to the folder spring-postgres-spatial and open it with your favourite IDE.

Implementing Entity and Repository

We have only one entity here, City, which will be used to store city information including its location. Let's create a new entity called City as follows.

@Entity
@Data
@NoArgsConstructor
@AllArgsConstructor
class City {

    @Id
    @GeneratedValue
    private Long id;

    private String name;

    @Column(columnDefinition = "geography(Point, 4326)")
    private Point location;
}

Line 2: Automatically enable getter/setter methods by using Lombok

Line 3: Generate no-arg constructor

Line 4: Generate constructor with all instance variables

Line 13: This uses the special PostGIS data type geography, described as follows;

geography: This indicates that the column will use the PostGIS geography data type, which is designed for storing geospatial data in a way that accounts for the Earth's curvature. This type is particularly useful for global, large-scale datasets where you want accurate distance and area calculations.

Point: Specifies that the data type for this column is a geographic point. Points are used to store coordinates (latitude and longitude).

4326: This is the Spatial Reference System Identifier (SRID) for WGS 84, which is the standard coordinate system used by GPS. SRID 4326 ensures that the coordinates are stored in a globally recognized format.

In order to manage the City entity in the database, we will use the following repository interface.

interface CityRepository extends CrudRepository<City, Long> {
    @Query("SELECT c FROM City c WHERE function('ST_DWithin', c.location, :point, :distance) = true")
    Iterable<City> findNearestCities(Point point, double distance);
}

ST_DWithin returns true if the geometries are within a given distance of each other. In our case, the query returns the cities whose location in the City table is within :distance of :point.
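
Under the hood, the function('ST_DWithin', ...) call in the JPQL query maps onto the PostGIS function of the same name. The roughly equivalent raw SQL looks like this (the table and column names assume Hibernate's default naming for the City entity, and ST_MakePoint takes longitude first):

SELECT * FROM city
WHERE ST_DWithin(location, ST_MakePoint(32.8541, 39.9208)::geography, 300000);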

Implementing Rest Endpoints

We have one root endpoint /api/v1/cities inside one controller and implement 3 actions for create, list, and find nearest locations as follows

@RestController
@RequestMapping("/api/v1/cities")
@RequiredArgsConstructor
class CityController {

    private final CityRepository cityRepository;

    private final GeometryFactory geometryFactory;

    @PostMapping
    void create(@RequestBody CreateCityRequest request) {
        Point point = geometryFactory.createPoint(new Coordinate(request.getLng(), request.getLat()));
        City city = new City();
        city.setName(request.getName());
        city.setLocation(point);
        cityRepository.save(city);
    }

    @GetMapping
    List<CityDto> findAll() {
        List<CityDto> cities = new ArrayList<>();
        cityRepository.findAll().forEach(c -> {
            cities.add(new CityDto(c.getName(), c.getLocation().getY(), c.getLocation().getX()));
        });
        return cities;
    }

    @GetMapping("/nearest")
    List<CityDto> findNearestCities(@RequestParam("lat") float lat, @RequestParam("lng") float lng, @RequestParam("distance") int distance) {
        List<CityDto> cities = new ArrayList<>();
        Point point = geometryFactory.createPoint(new Coordinate(lng, lat));
        cityRepository.findNearestCities(point, distance).forEach(c -> {
            cities.add(new CityDto(c.getName(), c.getLocation().getY(), c.getLocation().getX()));
        });
        return cities;
    }
}

Line 8: GeometryFactory is the JTS class pulled in through hibernate-spatial, and it is used to build geometric shapes. In our case, we convert a latitude-longitude pair to a Point, which is then used for the repository operations.
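
Since geometryFactory is injected through the constructor, Spring also needs a GeometryFactory bean in the context. The article doesn't show one, so here is a minimal sketch of how it could be provided (the class name and placement are assumptions):

@Configuration
class GeometryConfig {

    // Expose a single JTS GeometryFactory for injection;
    // SRID 4326 matches the geography(Point, 4326) column definition
    @Bean
    GeometryFactory geometryFactory() {
        return new GeometryFactory(new PrecisionModel(), 4326);
    }
}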

The create method accepts a CreateCityRequest, which contains the name, latitude, and longitude information as shown below.

@AllArgsConstructor
@NoArgsConstructor
@Data
class CreateCityRequest {

    private String name;
    private double lat;
    private double lng;
}

findAll is used to list all available cities in the database.

findNearestCities is used for finding neighbouring cities for a given coordinate and radius (in meters).
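
The CityDto response type used by the controller is not listed in the article; a minimal Lombok-style sketch that matches how the controller constructs it (name, lat, lng) could look like this:

@Data
@AllArgsConstructor
class CityDto {

    private String name;
    private double lat;
    private double lng;
}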

Now that the persistence layer and REST endpoints are ready, we can configure the application.

Application Configuration

This section contains application level configurations such as the application name, datasource, and jpa as shown below:

application.yaml
spring:
  application:
    name: spring-postgres-spatial
  datasource:
    url: <connection-string-from-rapidapp|or your own managed postgres url>
    username: <username>
    password: <password>
  jpa:
    database-platform: org.hibernate.dialect.PostgreSQLDialect
    hibernate:
      ddl-auto: update

Line 5: Connection URL for the PostgreSQL database. You can obtain this from Rapidapp or your own managed PostgreSQL service. It should have a format like jdbc:postgresql://<host>:<port>/<database>?sslmode=require.

Running Application

You can run the application as follows

./mvnw spring-boot:run

Demo

Create City

In this section, we will be creating the cities of Turkey with the following requests.

curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Adana", "lat": "37.0000", "lng": "35.3213"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Adıyaman", "lat": "37.7648", "lng": "38.2786"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Afyonkarahisar", "lat": "38.7507", "lng": "30.5567"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Ağrı", "lat": "39.7191", "lng": "43.0503"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Amasya", "lat": "40.6499", "lng": "35.8353"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Ankara", "lat": "39.9208", "lng": "32.8541"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Antalya", "lat": "36.8841", "lng": "30.7056"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Artvin", "lat": "41.1828", "lng": "41.8183"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Aydın", "lat": "37.8560", "lng": "27.8416"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Balıkesir", "lat": "39.6484", "lng": "27.8826"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Bilecik", "lat": "40.0567", "lng": "30.0665"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Bingöl", "lat": "39.0626", "lng": "40.7696"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Bitlis", "lat": "38.3938", "lng": "42.1232"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Bolu", "lat": "40.5760", "lng": "31.5788"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Burdur", "lat": "37.4613", "lng": "30.0665"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Bursa", "lat": "40.2669", "lng": "29.0634"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Çanakkale", "lat": "40.1553", "lng": "26.4142"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Çankırı", "lat": "40.6013", "lng": "33.6134"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Çorum", "lat": "40.5506", "lng": "34.9556"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Denizli", "lat": "37.7765", "lng": "29.0864"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Diyarbakır", "lat": "37.9144", "lng": "40.2306"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Edirne", "lat": "41.6818", "lng": "26.5623"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Elâzığ", "lat": "38.6810", "lng": "39.2264"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Erzincan", "lat": "39.7500", "lng": "39.5000"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Erzurum", "lat": "39.9000", "lng": "41.2700"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Eskişehir", "lat": "39.7767", "lng": "30.5206"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Gaziantep", "lat": "37.0662", "lng": "37.3833"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Giresun", "lat": "40.9128", "lng": "38.3895"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Gümüşhane", "lat": "40.4386", "lng": "39.5086"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Hakkâri", "lat": "37.5833", "lng": "43.7333"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Hatay", "lat": "36.4018", "lng": "36.3498"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Isparta", "lat": "37.7648", "lng": "30.5566"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Mersin", "lat": "36.8000", "lng": "34.6333"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "İstanbul", "lat": "41.0053", "lng": "28.9770"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "İzmir", "lat": "38.4189", "lng": "27.1287"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Kars", "lat": "40.6167", "lng": "43.1000"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Kastamonu", "lat": "41.3887", "lng": "33.7827"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Kayseri", "lat": "38.7312", "lng": "35.4787"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Kırklareli", "lat": "41.7333", "lng": "27.2167"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Kırşehir", "lat": "39.1425", "lng": "34.1709"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Kocaeli", "lat": "40.8533", "lng": "29.8815"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Konya", "lat": "37.8667", "lng": "32.4833"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Kütahya", "lat": "39.4167", "lng": "29.9833"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Malatya", "lat": "38.3552", "lng": "38.3095"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Manisa", "lat": "38.6191", "lng": "27.4289"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Kahramanmaraş", "lat": "37.5858", "lng": "36.9371"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Mardin", "lat": "37.3212", "lng": "40.7245"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Muğla", "lat": "37.2153", "lng": "28.3636"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Muş", "lat": "38.9462", "lng": "41.7539"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Nevşehir", "lat": "38.6939", "lng": "34.6857"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Niğde", "lat": "37.9667", "lng": "34.6833"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Ordu", "lat": "40.9839", "lng": "37.8764"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Rize", "lat": "41.0201", "lng": "40.5234"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Sakarya", "lat": "40.6940", "lng": "30.4358"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Samsun", "lat": "41.2928", "lng": "36.3313"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Siirt", "lat": "37.9333", "lng": "41.9500"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Sinop", "lat": "42.0231", "lng": "35.1531"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Sivas", "lat": "39.7477", "lng": "37.0179"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Tekirdağ", "lat": "40.9833", "lng": "27.5167"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Tokat", "lat": "40.3167", "lng": "36.5500"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Trabzon", "lat": "41.0015", "lng": "39.7178"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Tunceli", "lat": "39.3074", "lng": "39.4388"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Şanlıurfa", "lat": "37.1591", "lng": "38.7969"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Uşak", "lat": "38.6823", "lng": "29.4082"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Van", "lat": "38.4891", "lng": "43.4089"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Yozgat", "lat": "39.8181", "lng": "34.8147"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Zonguldak", "lat": "41.4564", "lng": "31.7987"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Aksaray", "lat": "38.3687", "lng": "34.0370"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Bayburt", "lat": "40.2552", "lng": "40.2249"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Karaman", "lat": "37.1759", "lng": "33.2287"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Kırıkkale", "lat": "39.8468", "lng": "33.5153"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Batman", "lat": "37.8812", "lng": "41.1351"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Şırnak", "lat": "37.4187", "lng": "42.4918"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Bartın", "lat": "41.5811", "lng": "32.4610"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Ardahan", "lat": "41.1105", "lng": "42.7022"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Iğdır", "lat": "39.8880", "lng": "44.0048"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Yalova", "lat": "40.6500", "lng": "29.2667"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Karabük", "lat": "41.2061", "lng": "32.6204"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Kilis", "lat": "36.7184", "lng": "37.1212"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Osmaniye", "lat": "37.2130", "lng": "36.1763"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Düzce", "lat": "40.8438", "lng": "31.1565"}'

List Cities

curl -XGET http://localhost:8080/api/v1/cities

Find Nearest Cities

To find the nearest cities to Ankara within a radius of 300 km, you can use the following.

curl -XGET http://localhost:8080/api/v1/cities/nearest\?lat\=39.9208\&lng\=32.8541\&distance\=300000

Conclusion

In this article, we explored the power of geospatial data and how to effectively utilize it within a Spring Boot application using PostgreSQL with the PostGIS extension. We covered the fundamental concepts of geospatial data, the benefits of using PostGIS for geospatial operations, and real-world use cases such as navigation, environmental monitoring, urban planning, and location-based services.

tip

You can find the complete source code for this project on GitHub.

Create and Deploy Spring Boot Todo App to Google Cloud Run

· 5 min read
Huseyin BABAL
Software Developer

Introduction

In the rapidly evolving world of software development, deploying applications in a scalable and efficient manner is critical. With the rise of cloud computing, services like Google Cloud Run have become essential for developers looking to deploy containerized applications quickly and effortlessly. In this blog post, we'll walk through deploying a simple todo app built with Spring Boot and PostgreSQL to Google Cloud Run. We'll cover setting up the project, integrating PostgreSQL, and deploying to the cloud, ensuring your app is ready to handle varying loads efficiently.

Why Connection Pooling is Essential for Serverless?

When deploying applications in a serverless environment like Google Cloud Run, managing database connections efficiently becomes crucial. Traditional connection management can lead to issues such as exhausting database connections, especially under load. This is where PgBouncer, a lightweight connection pooler for PostgreSQL, comes into play. It optimizes the usage of database connections, reducing latency and improving the performance of your serverless app. Additionally, it ensures that the application can handle sudden spikes in traffic without overwhelming the database.

In this article, we will be using PostgreSQL as our database. You can maintain your database in any database management system. For a convenient deployment option, consider cloud-based solutions like Rapidapp, which offers managed PostgreSQL databases, simplifying setup and maintenance.

tip

Create a free database with connection pooling support for the serverless use-cases in Rapidapp in seconds here

Step-by-Step Guide to Creating the Todo App

Project Initialization and Dependencies

We will be using Spring Boot and PostgreSQL to build a todo application. You can initialize a Spring Boot project by using the Spring Boot CLI. Once installed, you can use the following command to initialize a project with the required dependencies.

spring init \
--dependencies=web,data-jpa,postgresql \
--type=maven-project \
--javaVersion=21 \
cloud-run-todo

Line 2: web for implementing REST endpoints, data-jpa for database persistence, and postgresql for PostgreSQL driver.

Line 3: --type=maven-project for creating a Maven project.

Line 4: --javaVersion=21 we will use Java 21 in Google Cloud Run environment.

Now that we initialized the project, go to the folder cloud-run-todo and open it with your favourite IDE.

Implementing Entity and Repository

We have only one entity here, Todo, which will be used to store our todo items. Let's create a new entity called Todo as follows.

@Entity
class Todo {
    @Id
    @GeneratedValue(strategy = GenerationType.AUTO)
    private Integer id;
    private String description;
    private Boolean completed;

    public Todo(String description, Boolean completed) {
        this.description = description;
        this.completed = completed;
    }

    public Todo() {

    }

    public Integer getId() {
        return id;
    }

    public void setId(Integer id) {
        this.id = id;
    }

    public String getDescription() {
        return description;
    }

    public void setDescription(String description) {
        this.description = description;
    }

    public Boolean getCompleted() {
        return completed;
    }

    public void setCompleted(Boolean completed) {
        this.completed = completed;
    }
}

In order to manage the Todo entity in the database, we will use the following repository interface.

interface TodoRepository extends CrudRepository<Todo, Integer>{}

TodoRepository will be used to do crud operations for the Todo entity

Implementing Rest Endpoints

Since we have only one entity, we will have one root endpoint /api/v1/todos inside one controller and implement 2 actions for create and listing todo entities as follows

@RestController
@RequestMapping("/api/v1/todos")
class TodoController {

    private final TodoRepository todoRepository;

    TodoController(TodoRepository todoRepository) {
        this.todoRepository = todoRepository;
    }

    @PostMapping
    void create(@RequestBody CreateTodoRequest request) {
        this.todoRepository.save(new Todo(request.getDescription(), false));
    }

    @GetMapping
    Iterable<Todo> list() {
        return this.todoRepository.findAll();
    }
}

create method accepts a request CreateTodoRequest as shown below.

class CreateTodoRequest {
    private String description;

    public CreateTodoRequest(String description) {
        this.description = description;
    }

    public CreateTodoRequest() {
    }

    public String getDescription() {
        return description;
    }

    public void setDescription(String description) {
        this.description = description;
    }
}

Now that the persistence layer and REST endpoints are ready, we can configure the application.

Application Configuration

In a serverless environment, it is best practice to read the port from the PORT environment variable, since it is managed by the serverless provider. We can add the following configuration to application.properties

application.properties
server.port=${PORT:8080}

By doing this, if there is a PORT environment variable, it will take precedence over the default value of 8080. In order to create tables out of the entities automatically, we can use the following config.

application.properties
spring.jpa.hibernate.ddl-auto=update

As a final step, we need to create a file called project.toml in the root of the project to tell Cloud Run to use Java 21

project.toml
[[build.env]]
name = "GOOGLE_RUNTIME_VERSION"
value = "21"

Deploying to Google Cloud Run

We will be using the gcloud CLI to deploy our application to Google Cloud Run. Before running the deployment command, you need to prepare the datasource URL, username, and password for PostgreSQL, which are passed as environment variables to the application. Use the following command to deploy.

gcloud run deploy \
--source . \
--update-env-vars SPRING_DATASOURCE_URL=jdbc:postgresql://<host>:<port>/<db>,SPRING_DATASOURCE_USERNAME=<user>,SPRING_DATASOURCE_PASSWORD=<password>

If you are using Rapidapp as your managed database, do not forget to use the Pooling Port as the port value so that connection pooling is used, letting your database handle highly concurrent requests.

It will prompt for the name of the service; you can press enter to accept the default one. It will also prompt for the region; select the number of the desired region. If there is no problem, it will deploy your application and print the service URL.

Demo

Create Todo

curl -XPOST -H "Content-Type: application/json" https://<your>.a.run.app/api/v1/todos -d '{"description": "buy milk"}'

List Todos

curl -XGET https://<your>.a.run.app/api/v1/todos

Conclusion

Deploying a Spring Boot application to Google Cloud Run is straightforward and efficient, allowing developers to leverage the power of serverless computing. By integrating PostgreSQL with connection pooling using PgBouncer and considering services like RapidApp, you can ensure your application is robust and scalable. With this guide, you're now equipped to deploy your todo app to the cloud, ready to handle real-world workloads with ease.

tip

You can find the complete source code for this project on GitHub.

Automating Image Metadata Extraction with AWS Lambda, Go, and PostgreSQL

· 9 min read
Huseyin BABAL
Software Developer

Introduction

In today's digital age, images play a crucial role in various applications and services. However, managing and extracting metadata from these images can be a challenging task, especially when dealing with large volumes of data. In this article, we'll explore how to leverage AWS Lambda, Go, and PostgreSQL to create an automated system for extracting EXIF data from images and storing it in a database.

What is AWS Lambda?

AWS Lambda is a serverless compute service that lets you run code without provisioning or managing servers. It automatically scales your applications in response to incoming requests, making it an ideal solution for event-driven architectures. With Lambda, you only pay for the compute time you consume, making it cost-effective for various use cases.

Use-cases

AWS Lambda can be employed in numerous scenarios, including:

  • Real-time file processing
  • Data transformations
  • Automated backups
  • Scheduled tasks
  • Webhooks and API backends

In our case, we'll use Lambda to process images as they're uploaded to an S3 bucket, extract their EXIF data, and store it in a PostgreSQL database.

In this article, we will be using PostgreSQL as our database. You can maintain your database in any database management system. For a convenient deployment option, consider cloud-based solutions like Rapidapp, which offers managed PostgreSQL databases, simplifying setup and maintenance.

tip

Create a free database with connection pooling support for the serverless use-cases in Rapidapp in seconds here

Implementation

Project Initialization and Dependencies

In this project we will implement a Go function that depends on AWS Lambda and PostgreSQL. You can initialize the Go project and install the dependencies as follows.

mkdir aws-lambda-go
cd aws-lambda-go
go mod init aws-lambda-go
go get -u github.com/aws/aws-lambda-go/lambda
go get -u github.com/aws/aws-sdk-go-v2/config
go get -u github.com/aws/aws-sdk-go-v2/service/s3
go get -u github.com/lib/pq
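
The handler will also decode EXIF headers; the calls used later (exif.Decode, exifData.Get(exif.Model)) match the API of the github.com/rwcarlsen/goexif package, so assuming that is the library in use, you would add it as well:

go get -u github.com/rwcarlsen/goexif/exif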

Function Endpoint

main.go
package main
...
import "github.com/aws/aws-lambda-go/lambda"
...
func HandleRequest(ctx context.Context, event events.S3Event) (*string, error) {
    // Function logic goes here
}

func main() {
    lambda.Start(HandleRequest)
}

Line 5: As always, context is used to control the execution, and since this function is triggered by an S3 event, we use the events.S3Event type. This means that once this function starts running, we will have a payload that contains the S3 event that triggered the function.

Line 10: In this part, the actual function logic is handed to the lambda.Start wrapper coming from the aws-lambda-go package.

Let's deep dive into actual function logic.

Database Connection

We get the database connection URL from the environment variables and then connect to the database. It is also a good idea to ping the database to be sure it is healthy.

main.go
connStr := os.Getenv("DB_URL")
db, err := sql.Open("postgres", connStr)
if err != nil {
    return nil, fmt.Errorf("failed to open database: %s", err)
}
defer db.Close()

err = db.Ping()
if err != nil {
    return nil, fmt.Errorf("failed to ping database: %s", err)
}
fmt.Println("Successfully connected to the database!")

Retrieving Object from S3

Once the function is triggered by an S3 event, we get the object from the S3 bucket as follows.

main.go
sdkConfig, err := config.LoadDefaultConfig(ctx)
if err != nil {
    return nil, fmt.Errorf("failed to load SDK config: %s", err)
}
s3Client := s3.NewFromConfig(sdkConfig)

var bucket string
var key string
for _, record := range event.Records {
    bucket = record.S3.Bucket.Name
    key = record.S3.Object.URLDecodedKey

    // Get the object
    getObjectOutput, err := s3Client.GetObject(ctx, &s3.GetObjectInput{
        Bucket: &bucket,
        Key:    &key,
    })
    if err != nil {
        return nil, fmt.Errorf("failed to get object %s/%s: %s", bucket, key, err)
    }
    defer getObjectOutput.Body.Close()
    ...
}

Line 1: If you have ever used the AWS SDKs before, you might have seen the credential chain. The AWS SDK can use different methods to resolve credentials when creating a session to connect to AWS services. If you don't pass any credentials, it tries to find them in the environment variables; if it cannot, it falls back to the instance metadata to determine the identity. In the AWS Lambda environment, it knows how to resolve the identity to construct a session in Go.

Line 14: In this part, we get the object from the S3 bucket. We will use this object to decode the image and extract the EXIF information.

Extracting EXIF Data

main.go
buf := new(bytes.Buffer)
_, err = buf.ReadFrom(getObjectOutput.Body)
if err != nil {
return nil, fmt.Errorf("failed to read object %s/%s: %s", bucket, key, err)
}

// Check EXIF data
exifData, err := exif.Decode(buf)
if err != nil {
return nil, fmt.Errorf("failed to decode EXIF data: %s", err)
}

log.Printf("successfully retrieved %s/%s with EXIF DateTime: %v", bucket, key, exifData)

Line 2: Create a reader from S3 object contents to use for decoding EXIF data.

Line 8: Extract EXIF data from image

Store in Postgres Database

There is a lot of information in image headers, but in our case we will use two fields: make and model.

main.go
// SQL statement
sqlStatement := `INSERT INTO images (bucket, key, model, company) VALUES ($1,$2,$3,$4)`

// Execute the insertion
model, err := exifData.Get(exif.Model)
if err != nil {
    return nil, fmt.Errorf("failed to get model: %s", err)
}
company, err := exifData.Get(exif.Make)
if err != nil {
    return nil, fmt.Errorf("failed to get company: %s", err)
}
_, err = db.Exec(sqlStatement, bucket, key, model.String(), company.String())
if err != nil {
    return nil, fmt.Errorf("failed to execute SQL statement: %s", err)
}

We basically read the EXIF data and insert it into the database. You can use the following to create the images table in your database.

CREATE TABLE images (
    bucket varchar(255),
    key varchar(255),
    model varchar(255),
    company varchar(255)
);

  • bucket - S3 bucket name
  • key - S3 object key
  • model - Model name of the camera used to take the image
  • company - Company name of the camera used to take the image

Now that we implemented our image metadata extraction, let's take a look at how we can deploy this function to AWS Lambda.

Deployment

Preparing Artifact

There is a reason to have a main function here: we are about to build an executable that is passed as the bootstrap entrypoint to the AWS Lambda environment. We need to build the executable, zip it, and upload it to AWS Lambda as a new function.

GOOS=linux GOARCH=arm64 go build -tags lambda.norpc -o bootstrap main.go

We build an executable for the Linux OS and ARM64 architecture, using main.go as the entrypoint. The lambda.norpc tag excludes the RPC library from the executable; that library is only needed if you are using the legacy go1.x runtime. Also, we name the executable bootstrap, since this is the entrypoint AWS Lambda expects; the function will not run if you use another name. Finally, we zip the executable and upload it to AWS Lambda as a new function.

zip PhotoHandler.zip bootstrap

AWS Requirements

Once we deploy the function, it will require a set of permissions such as;

  • Accessing S3 buckets
  • Being able to create log groups in CloudWatch
  • Being able to write to CloudWatch logs

We can create an AWS role with the following policy for this purpose and assign it to the Lambda function.
trust-policy.json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "logs:CreateLogGroup",
      "Resource": "arn:aws:logs:<region>:<account-id>:*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "logs:CreateLogStream",
        "logs:PutLogEvents",
        "lambda:InvokeFunction"
      ],
      "Resource": [
        "arn:aws:logs:<region>:<account-id>:log-group:/aws/lambda/PhotoHandler:*",
        "arn:aws:lambda:<region>:<account-id>:function:PhotoHandler"
      ]
    },
    {
      "Effect": "Allow",
      "Action": "s3:GetObject",
      "Resource": "*"
    }
  ]
}

Line 7: This part is used to create the CloudWatch log group. Do not forget to use your own region and account ID. You can grab the account id with the following command

aws sts get-caller-identity

Line 17-18: This section scopes another set of permissions for creating log streams and log events, and for invoking a specific function, PhotoHandler in our case. Again, do not forget to replace the region and account id with your own.

Line 23: This section contains the permission to access S3 buckets.

Now you can store this as trust-policy.json and execute the following command to create the role.

aws iam create-role \
--role-name photo-handler \
--assume-role-policy-document \
file://trust-policy.json

Remember this role name since we will use it on AWS Lambda function creation.

AWS Lambda Function Creation

You can create a new lambda function as follows.

aws lambda create-function \
--function-name PhotoHandler \
--runtime provided.al2023 \
--handler bootstrap \
--architectures arm64 \
--role arn:aws:iam::<account-id>:role/photo-handler \
--zip-file fileb://PhotoHandler.zip

Line 3: provided.al2023 is an OS-only runtime environment; since we already have a binary executable, it can be provided to this environment as the entrypoint.

Line 6: Do not forget to replace the account ID with your own; this part binds the role to this specific function. The function's execution role will then be able to perform the operations granted by the policy we created in the previous section.

Adding S3 Events Trigger

In this section, we will add a trigger for S3 events so that this Lambda function is invoked whenever you upload a new image to a specific S3 bucket.

s3-notification.json
{
"LambdaFunctionConfigurations": [
{
"LambdaFunctionArn": "arn:aws:lambda:<region>:<account-id>:function:PhotoHandler",
"Events": [
"s3:ObjectCreated:*"
],
"Filter": {
"Key": {
"FilterRules": [
{
"Name": "prefix",
"Value": "acme-images/"
},
{
"Name": "suffix",
"Value": ".jpeg"
}
]
}
}
}
]
}

Now you can configure your bucket for the notifications so that it will trigger this lambda function.

aws s3api put-bucket-notification-configuration \
--bucket acme-images \
--notification-configuration file://s3-notification.json

This configuration ensures that notifications about S3 object creation events are sent to trigger the AWS Lambda function. The event can then be consumed inside the HandleRequest function.

Last Step

We have now added a trigger to the Lambda function for S3 events. Whenever you upload a new JPEG file to the acme-images bucket, the Lambda function will be invoked, extract the EXIF data, and finally store it in the PostgreSQL database.

Conclusion

In this article, we explored how to automate image metadata extraction using AWS Lambda, Go, and PostgreSQL. We demonstrated how to use AWS Lambda to handle S3 events, extract EXIF data from images using the exif package in Go, and store the extracted metadata in a PostgreSQL database using Rapidapp, a managed PostgreSQL-as-a-Service. There will be more serverless use cases in future articles, so do not forget to subscribe.

tip

You can find the complete source code for this project on GitHub.

Integrating Spring AI with Vector Databases - A Guide Using PGVector

· 7 min read
Huseyin BABAL
Software Developer

What is a Vector Database?

A vector database is a specialized type of database optimized for storing, retrieving, and performing operations on vector data. Vectors, in this context, are typically arrays of numerical values that represent data in a multi-dimensional space. These are widely used in machine learning and AI for tasks like similarity search, where the goal is to find data points that are close to a given query point in this multi-dimensional space. Vector databases provide efficient indexing and querying capabilities for such operations, often leveraging advanced mathematical and computational techniques to ensure fast and accurate results.

What is PGVector?

pgvector is an extension for PostgreSQL that adds support for storing and querying vector data. It allows users to leverage PostgreSQL's powerful database capabilities while adding specialized functionality for vector operations. With pgvector, you can store high-dimensional vectors, perform similarity searches, and integrate vector operations seamlessly with your existing PostgreSQL databases.

In this article, we will be using PostgreSQL as our database. You can maintain your database in any database management system. For a convenient deployment option, consider cloud-based solutions like Rapidapp, which offers managed PostgreSQL databases, simplifying setup and maintenance.

tip

Create a free database with pgvector support in Rapidapp in seconds here

How Spring Integrates with Vector Databases

Spring, a popular framework for building Java applications, provides robust support for integrating with various types of databases, including vector databases like pgvector. Using Spring AI PGVector Store, developers can easily manage data access and integrate vector operations into their applications. Spring AI offers additional capabilities to enhance machine learning and AI integrations, making it a powerful choice for applications that require advanced data handling and analytics.

Creating a Spring Project

To get started, we'll create a new Spring project. This can be done using Spring Initializr or any other method you prefer. For simplicity, we'll use Spring Initializr here.

  1. Navigate to Spring Initializr: Open your browser and go to Spring Initializr.
  2. Project Settings: Set the following options:
    • Project: Maven Project
    • Language: Java
    • Spring Boot: (select the latest stable version)
    • Dependencies: Add Spring Web
Download the project and unzip it. Open the pom.xml file and add the following dependencies to the dependencies section:
pom.xml
<dependency>
<groupId>org.springframework.ai</groupId>
<artifactId>spring-ai-pgvector-store-spring-boot-starter</artifactId>
</dependency>

<dependency>
<groupId>org.springframework.ai</groupId>
<artifactId>spring-ai-ollama-spring-boot-starter</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.ai</groupId>
<artifactId>spring-ai-transformers-spring-boot-starter</artifactId>
</dependency>

Application YAML Configuration

Next, we need to configure our application to use PostgreSQL as a vector store. Update your application.yaml file as follows:

application.yaml
spring:
datasource:
url: jdbc:postgresql://<host>:<port>/<db>?application_name=rapidapp_spring_ai
username: <user>
password: <password>
ai:
ollama:
embedding:
enabled: false
vectorstore:
pgvector:
index-type: hnsw
distance-type: cosine_distance
dimensions: 384

index-type: Specifies the type of index to be used for vector data. Common options include ivfflat and hnsw.

dimensions: Indicates the dimensionality of the vectors being stored; it must match the output size of your embedding model (384 in our case).

distance-type: Defines the distance metric used for similarity search, such as cosine_distance (used here), Euclidean distance (L2), or inner product.

Index Types

You can see brief descriptions of the index types used in pgvector below; if you want to know more, you can refer here

HNSW

HNSW (Hierarchical Navigable Small World) is an advanced indexing algorithm designed for efficient approximate nearest neighbor search in high-dimensional spaces. It builds a graph structure where each node represents a vector, and edges represent connections to other vectors. The graph is navigable through multiple layers, allowing for fast and scalable searches by traversing the most relevant nodes. HNSW is known for its high accuracy and low search latency, making it suitable for real-time applications requiring quick similarity searches.

IVF Flat

IVF Flat (Inverted File Flat) is a popular indexing method that partitions the vector space into clusters using a coarse quantizer. Each vector is assigned to a cluster, and an inverted list is maintained for each cluster containing the vectors assigned to it. During a search, only the clusters closest to the query vector are examined, significantly reducing the number of comparisons needed. IVF Flat provides a good balance between search speed and accuracy, and it is especially effective when dealing with large datasets, as it limits the scope of the search to relevant clusters.

Distance Types

Distance types are metrics used to measure the similarity or dissimilarity between vectors in a vector database. Different applications and data types may require different distance metrics to ensure accurate and meaningful results. Here are some commonly used distance types

Euclidean Distance (L2)

This is the most widely used distance metric, measuring the straight-line distance between two points in a multi-dimensional space. It's calculated as the square root of the sum of the squared differences between corresponding elements of the vectors. Euclidean distance is suitable for general-purpose similarity searches and is often used in clustering algorithms.

Cosine Similarity

This metric measures the cosine of the angle between two vectors, providing a value between -1 and 1. Cosine similarity is particularly useful when the magnitude of the vectors is not important, focusing instead on the direction. It's commonly used in text mining and natural language processing to measure the similarity of documents or word embeddings.

Inner Product (Dot Product)

This metric calculates the sum of the products of corresponding elements of two vectors. It's often used in neural networks and machine learning models to measure the alignment between vectors. Inner product similarity is useful when comparing vectors where higher values indicate greater similarity.

Manhattan Distance (L1)

Also known as the city block distance, it measures the sum of the absolute differences between corresponding elements of two vectors. Manhattan distance is useful in scenarios where differences in individual dimensions are more significant than the overall geometric distance, such as in certain types of image processing.

Hamming Distance

This metric counts the number of positions at which the corresponding elements of two vectors are different. It's mainly used for binary vectors or strings of equal length, making it suitable for applications in error detection and correction, as well as DNA sequence analysis.

Choosing the right distance type depends on the specific requirements of your application and the nature of your data. Each distance metric has its strengths and weaknesses, and understanding these can help optimize the performance and accuracy of similarity searches in your vector database.
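To make these metrics concrete, here is a small, self-contained Java sketch (purely illustrative and not part of the Spring project) that computes the most common distances for two example vectors:

DistanceExamples.java
public class DistanceExamples {

    public static void main(String[] args) {
        double[] a = {1.0, 2.0, 3.0};
        double[] b = {2.0, 4.0, 6.0};

        System.out.println("Euclidean (L2): " + euclidean(a, b));  // ~3.742
        System.out.println("Cosine sim.:    " + cosine(a, b));     // 1.0 (same direction)
        System.out.println("Inner product:  " + dot(a, b));        // 28.0
        System.out.println("Manhattan (L1): " + manhattan(a, b));  // 6.0
    }

    static double euclidean(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) sum += Math.pow(a[i] - b[i], 2);
        return Math.sqrt(sum);
    }

    static double dot(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) sum += a[i] * b[i];
        return sum;
    }

    static double cosine(double[] a, double[] b) {
        return dot(a, b) / (Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b)));
    }

    static double manhattan(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) sum += Math.abs(a[i] - b[i]);
        return sum;
    }
}

Since b is exactly 2 * a in this example, the cosine similarity is 1.0 even though the Euclidean and Manhattan distances are non-zero, which illustrates why cosine is a good fit when only direction matters.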

Implementing a Document Controller

Create a new controller that will manage vector data operations. It injects the VectorStore bean auto-configured by the pgvector starter and uses it for all vector store interactions.

DocumentController.java
@RestController
@RequestMapping("/documents")
class DocumentController {

@Autowired
private VectorStore vectorStore;

@PostMapping
public void create(@RequestBody CreateDocumentRequest request) {
vectorStore.add(List.of(new Document(request.text(), request.meta())));
}

@GetMapping
public String list(@RequestParam("query") String query) {
List<Document> results = vectorStore.similaritySearch(SearchRequest.query(query).withTopK(5));
return results.toString();
}
}
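The CreateDocumentRequest payload is not part of the snippet above; a minimal sketch as a Java record (the record itself is an assumption, only the text and meta accessors are taken from the controller) could be:

CreateDocumentRequest.java
import java.util.Map;

// Hypothetical request payload; the accessors text() and meta() match
// the calls made in DocumentController above.
public record CreateDocumentRequest(String text, Map<String, Object> meta) {}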

Create Documents

# Document 1
curl \
-H "Content-Type: application/json" \
-d '{"text": "Prometheus collects metrics from targets by scraping metrics HTTP endpoints. Since Prometheus exposes data in the same manner about itself, it can also scrape and monitor its own health.", "meta": {"category": "getting-started"}}' \
http://localhost:8080/documents

# Document 2
curl \
-H "Content-Type: application/json" \
-d '{"text": "Prometheus local time series database stores data in a custom, highly efficient format on local storage.", "meta": {"category": "storage"}}' \
http://localhost:8080/documents

Search Documents

curl http://localhost:8080/documents?query="scrape"

Conclusion

Integrating Spring AI with vector databases like pgvector provides powerful capabilities for handling vector data and performing advanced similarity searches. By leveraging Spring's robust framework and pgvector's specialized vector operations, developers can build sophisticated applications that effectively manage and analyze high-dimensional data. Rapidapp further enhances this setup with its user-friendly interface and built-in vector store support, making it easier than ever to develop and maintain vector-based applications.

tip

You can find the complete source code for this project on GitHub.

Building an Application with JHipster, PostgreSQL, and Elasticsearch in 10 Minutes

· 7 min read
Huseyin BABAL
Software Developer

Introduction

In today’s fast-paced development landscape, creating robust and scalable applications quickly is essential. Leveraging jHipster, PostgreSQL, and Elasticsearch can streamline this process. This article walks you through the steps of building a demo project, showcasing the integration of these powerful tools in just 10 minutes.

Why jHipster?

jHipster accelerates application development by providing a complete stack, including front-end and back-end technologies. It generates high-quality code, follows best practices, and offers extensive tooling, making it a go-to solution for developers seeking efficiency and reliability.

Prerequisites

PostgreSQL

In this article, we will be using PostgreSQL as our database. You can maintain your database in any database management system. For a convenient deployment option, consider cloud-based solutions like Rapidapp, which offers managed PostgreSQL databases, simplifying setup and maintenance.

tip

Create a free database in Rapidapp in seconds here

Elasticsearch

We will be using Elasticsearch for the search engine. Elasticsearch is an open-source, distributed, and scalable search engine which you can deploy on-premises or in the cloud. You can use Elastic Cloud if you don't want to maintain your own instance.

jHipster CLI

To get started with jHipster, you'll need to install the jHipster CLI.

Getting Started

You can simply run the jhipster command in your terminal and follow the prompts to get started, as shown below. Do not forget to provide your own names for fields like the application name, package name, etc.

? What is the base name of your application? demo
? Which *type* of application would you like to create? Monolithic application (recommended for simple projects)
? What is your default Java package name? com.huseyinbabal.demo
? Would you like to use Maven or Gradle for building the backend? Maven
? Do you want to make it reactive with Spring WebFlux? No
? Which *type* of authentication would you like to use? JWT authentication (stateless, with a token)
? Besides JUnit, which testing frameworks would you like to use?
? Which *type* of database would you like to use? SQL (H2, PostgreSQL, MySQL, MariaDB, Oracle, MSSQL)
? Which *production* database would you like to use? PostgreSQL
? Which *development* database would you like to use? PostgreSQL
? Which cache do you want to use? (Spring cache abstraction) Ehcache (local cache, for a single node)
? Do you want to use Hibernate 2nd level cache? Yes
? Which other technologies would you like to use? Elasticsearch as search engine
? Which *framework* would you like to use for the client? React
? Besides Jest/Vitest, which testing frameworks would you like to use?
? Do you want to generate the admin UI? Yes
? Would you like to use a Bootswatch theme (https://bootswatch.com/)? Default JHipster
? Would you like to enable internationalization support? No
? Please choose the native language of the application English

This will generate a full-stack application where PostgreSQL and Elasticsearch are configured and enabled during application startup. Now that you have the basic setup, you can start configuring the datasource.

PostgreSQL Configuration

Once you open the project folder in your favourite IDE, you can see the generated application*.yaml files under the src/main/resources/config folder. Since we are doing local development for now, you can open application-dev.yaml and configure the datasource as follows.

application-dev.yaml
spring:
datasource:
url: jdbc:postgresql://<host>:<port>/<db_name>?sslmode=require&application_name=rapidapp_jhipster # You can find details on Rapidapp db details page.
username: <username>
password: <password>

Line 3: You can find the DB connection details on Rapidapp db details page.

Elasticsearch Configuration

We will configure Elasticsearch as a search engine in our application. Once you create your own Elasticsearch instance, or create one in Elastic Cloud, note your Elasticsearch credentials to use them in the following configuration section.

application-dev.yaml
spring:
elasticsearch:
uris: https://elastic:<password>@<host>:<port>

Running the Application

Now that you have configured your database and search engine, you can start the application with the following command:

./mvnw

The above command will do the following:

  • Build the frontend and backend projects
  • Start the backend project while running Liquibase asynchronously. Liquibase will prepare the database schema using your entities.
  • Start the frontend project.
If everything goes well, you will see output like the following:
2024-06-19T17:01:11.056+03:00  INFO 65684 --- [  restartedMain] com.huseyinbabal.jdemo.JDemoApp          :
----------------------------------------------------------
Application 'jDemo' is running! Access URLs:
Local: http://localhost:8080/
External: http://192.168.1.150:8080/
Profile(s): [dev, api-docs]
----------------------------------------------------------

You can simply navigate to http://localhost:8080/ to access the application. It will show you the default credentials for users with admin and user rights; you can log in with the admin:admin credentials to see what the admin UI looks like. You can see the critical components below:

  • Entities: Entities used in this application. We will see this soon to create our own entities to use in the application.
  • Administration > Metrics: You can see several metrics like JVM, Cache, HTTP statistics.
  • Administration > Health: You can see the health information of the application like db, disk health.
  • Administration > Logs: You can see the log configuration of the application, where you can set the log level at the root or package level.
Feel free to walk through the menus in the Admin UI to get familiar with them; meanwhile, let's see how we can add our own entities to the application.

Adding Entities

We will add our own entities to the application. Let's create a new entity called Product and add it to the application with the following command

jhipster entity product

It will prompt you to add fields for this entity. You can use the following fields:

  • title: String
  • description: String
  • price: Float
Once it is done, jHipster will create the necessary entity in the codebase and a related controller for the CRUD operations. src/main/java/<package>/domain/Product.java contains the generated entity class and src/main/java/<package>/repository/ProductRepository.java contains the generated repository class. For the presentation layer, take a look at src/main/java/<package>/web/rest/ProductResource.java. Let's look at how it creates a product, as shown below.
ProductResource.java
@PostMapping("")
public ResponseEntity<Product> createProduct(@RequestBody Product product) throws URISyntaxException {
log.debug("REST request to save Product : {}", product);
if (product.getId() != null) {
throw new BadRequestAlertException("A new product cannot already have an ID", ENTITY_NAME, "idexists");
}
product = productRepository.save(product);
productSearchRepository.index(product);
return ResponseEntity.created(new URI("/api/products/" + product.getId()))
.headers(HeaderUtil.createEntityCreationAlert(applicationName, false, ENTITY_NAME, product.getId().toString()))
.body(product);
}

Line 8: productSearchRepository.index(product) is used to index the product in Elasticsearch. You can see how easy it is to store product data in Elasticsearch: we haven't written any code for that, but since we added the Elasticsearch configuration, jHipster becomes an Elasticsearch-aware system and generates the required functions for us.

In the same way, let's see how it searches products, as shown below.

ProductResource.java
@GetMapping("/_search")
public ResponseEntity<List<Product>> searchProducts(
@RequestParam("query") String query,
@org.springdoc.core.annotations.ParameterObject Pageable pageable
) {
log.debug("REST request to search for a page of Products for query {}", query);
try {
Page<Product> page = productSearchRepository.search(query, pageable);
HttpHeaders headers = PaginationUtil.generatePaginationHttpHeaders(ServletUriComponentsBuilder.fromCurrentRequest(), page);
return ResponseEntity.ok().headers(headers).body(page.getContent());
} catch (RuntimeException e) {
throw ElasticsearchExceptionMapper.mapException(e);
}
}

Line 8: productSearchRepository.search(query, pageable) is used to search products in Elasticsearch.
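The ProductSearchRepository itself is generated by jHipster, so we never had to write it by hand. Purely as a rough sketch of the idea (this is not the generated code, and the derived query is an assumption), a hand-written Spring Data Elasticsearch repository could look like this:

import org.springframework.data.domain.Page;
import org.springframework.data.domain.Pageable;
import org.springframework.data.elasticsearch.repository.ElasticsearchRepository;

// Simplified, hypothetical repository; the jHipster-generated class is more
// elaborate and builds its query from the raw query string.
public interface ProductSearchRepository extends ElasticsearchRepository<Product, Long> {

    // Derived query matching the search term against title or description.
    Page<Product> findByTitleContainingOrDescriptionContaining(String title, String description, Pageable pageable);
}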

Product Entity in Admin UI

As you can see, we now have a Product entity in the Admin UI, visible under the Entities menu. Once you navigate to the Product module, you can create a new product, list products, view the details of a product, or delete any of them. There is also a search bar where you can search products with the help of Elasticsearch on the backend side.

Conclusion

We have seen that you can leverage jHipster's rapid development capabilities along with the robust data management of PostgreSQL and powerful search functionality of Elasticsearch. jHipster simplifies and accelerates application creation with its comprehensive toolset and best practices, while PostgreSQL ensures reliable and efficient data handling. Elasticsearch adds advanced search capabilities, making your application both scalable and responsive. Utilizing Rapidapp's PostgreSQL as a service further streamlines database management, allowing you to focus on developing high-quality applications quickly and effectively.

tip

You can find the complete source code for this project on GitHub.

Building Devops AI Assistant with Langchain, Ollama, and PostgreSQL

· 6 min read
Huseyin BABAL
Software Developer

Introduction

Vector databases emerge as a powerful tool for storing and searching high-dimensional data like document embeddings, offering lightning-fast similarity queries. This article delves into leveraging PostgreSQL, a popular relational database, as a vector database with the pgvector extension. We'll explore how to integrate it into a LangChain workflow for building a robust question-answering (QA) system.

What are Vector Databases?

Imagine a vast library holding countless documents. Traditional relational databases might classify them by subject or keyword. But what if you want to find documents most similar to a specific concept or question, even if keywords don't perfectly align? Vector databases excel in this scenario. They store data as numerical vectors in a high-dimensional space, where closeness in the space reflects semantic similarity. This enables efficient retrieval of similar documents based on their meaning, not just exact keyword matches.

PostgreSQL as Vector Database

PostgreSQL, a widely adopted and versatile relational database system, can be empowered with vector search capabilities using the pgvector extension. You can maintain your database in any database management system. For a convenient deployment option, consider cloud-based solutions like Rapidapp, which offers managed PostgreSQL databases, simplifying setup and maintenance.

tip

Create a free database in Rapidapp in seconds here

If you maintain PostgreSQL database on your own, you can enable pgvector extension by executing the following command for each database as shown below.

CREATE EXTENSION vector;

LangChain: Building Flexible AI Pipelines

LangChain is a powerful framework that facilitates the construction of modular AI pipelines. It allows you to chain together various AI components seamlessly, enabling the creation of complex and customizable workflows.

Our Use Case: Embedding Data for AI-powered QA

In our specific scenario, we aim to leverage vector search to enhance a question-answering (QA) system. Here's how the components fit together:

  • Data Preprocessing: Process your documents (e.g., web pages) using Natural Language Processing (NLP) techniques to extract relevant text content. Generate vector representations of your documents using an appropriate AI library (e.g., OllamaEmbeddings in your code).

  • Embedding Storage with pgvector: Store the document vectors and their corresponding metadata (e.g., titles, URLs) in your PostgreSQL database table using pgvector.

  • Building the LangChain Workflow: Construct a LangChain pipeline that incorporates the following elements:

    • Retriever: This component retrieves relevant documents from your PostgreSQL database using vector similarity search powered by pgvector. When a user poses a question, the retriever searches for documents with vector representations closest to the query's vector.
    • Question Passage Transformer: (Optional) This component can further process the retrieved documents to extract snippets most relevant to the user's query.
    • Language Model (LLM): This component uses the retrieved context (potentially augmented with question-specific passages) to formulate a comprehensive response to the user's question.

DevOps AI Assistant: Step-by-step Implementation

We will implement the application using Python, and we will use Poetry for dependency management.

Project Creation

Create a directory and initiate a project by running the following command:

poetry init

This will create a pyproject.toml file in the current directory.

Dependencies

You can install dependencies by running the following command:

poetry add langchain-cohere \
langchain-postgres \
langchain-community \
html2text \
tiktoken

Once you have installed the dependencies, you can create an empty main.py file to implement our business logic.

Preparing the PostgreSQL Connection URL

Once you create your database on Rapidapp, or use your own database, you can construct the PostgreSQL connection URL as follows:

postgresql+psycopg://<user>:<pass>@<host>:<port>/<db>

Defining the Vector Store

connection = "<connection_string>"
collection_name = "prometheus_docs"
embeddings = OllamaEmbeddings()

vectorstore = PGVector(
embeddings=embeddings,
collection_name=collection_name,
connection=connection,
use_jsonb=True,
)

As you can see, we use embeddings in the codebase. Your implementation can interact with different AI providers like OpenAI, HuggingFace, Ollama, etc.; the embeddings abstraction provides a standard interface for all of them. In our case, we use OllamaEmbeddings, since we will be using Ollama as the AI provider.

Line 2: This is the collection name in the PostgreSQL database where we store the vector documents. In our case, we will store a couple of Prometheus documentation pages to help the AI provider answer users' questions.

Line 5: LangChain has lots of vector store implementations, and PGVector is one of them. This lets us perform vector similarity search against the PostgreSQL database.

Indexing Documents

urls = ["https://prometheus.io/docs/prometheus/latest/getting_started/", "https://prometheus.io/docs/prometheus/latest/federation/"]
loader = AsyncHtmlLoader(urls)
docs = loader.load()

htmlToText = Html2TextTransformer()
docs_transformed = htmlToText.transform_documents(docs)

splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
chunk_size=1000, chunk_overlap=0
)
docs = splitter.split_documents(docs_transformed)
vectorstore.add_documents(docs)

Line 1-3: With the help of AsyncHtmlLoader, we simply load two Prometheus documentation pages.

Line 5-6: Since we cannot use the raw HTML directly, we convert it to text using Html2TextTransformer.

Line 8-11: RecursiveCharacterTextSplitter helps by chunking large text documents into manageable pieces that comply with vector store limitations, improve embedding efficiency, and potentially enhance retrieval accuracy.

Line 12: Store processed documents into vector store.

Building the LangChain Workflow

retriever = vectorstore.as_retriever()
llm = Ollama()

message = """
Answer this question using the provided context only.

{question}

Context:
{context}
"""

prompt = ChatPromptTemplate.from_messages([("human", message)])

rag_chain = {"context": retriever, "question": RunnablePassthrough()} | prompt | llm
response = rag_chain.invoke("how to federate on prometheus")
print(response)

Above code snippet demonstrates how to use LangChain to retrieve information from a vector store and generate a response using a large language model (LLM) based on the retrieved information. Let's break it down step-by-step:

Line 1: This line converts the vector store we set up earlier into a retriever within LangChain. The retriever is responsible for fetching relevant documents based on a query.

Line 2: This line initializes an instance of the Ollama LLM, which will be used to generate the response to the question.

Line 4: The code defines a multi-line string variable named message. The string is a template with two placeholders: {question}, which will hold the specific question you want to answer, and {context}, which will contain the relevant background information retrieved for that question.

Line 13: Builds the chat prompt template from the message.

Line 15: Here the retrieved context and the question are piped into the template to generate the prompt, which is then passed to the LLM to generate the response. Note that this composition forms a runnable chain.

Line 16: We invoke the chain with a question and get the response.

Conclusion

In this practical guide, we've delved into using PostgreSQL as a vector database, leveraging the pgvector extension. We explored how this approach can be used to build a context-aware AI assistant, focusing on Prometheus documentation as an example. By storing document embeddings alongside their metadata, we enabled the assistant to retrieve relevant information based on semantic similarity, going beyond simple keyword matching. LangChain played a crucial role in this process. Its modular framework allowed us to effortlessly connect various AI components, like PGVector for vector retrieval and OllamaEmbeddings for interacting with our chosen AI provider. Furthermore, LangChain's ability to incorporate context within user questions significantly enhances the relevance and accuracy of the assistant's responses.

tip

You can find the complete source code for this project on GitHub.

Securing Your Spring Boot App with JWT Authentication

· 8 min read
Huseyin BABAL
Software Developer

Introduction

This article dives into securing a Spring Boot application using JSON Web Tokens (JWT) for authentication. We'll explore Spring Security, JWT fundamentals, and then implement a secure API with user registration, login, and access control. Our data will be persisted in a PostgreSQL database using Spring Data JPA.

Why Spring Security?

Spring Security is an industry-standard framework for securing Spring applications. It offers comprehensive features for authentication, authorization, and access control. By leveraging Spring Security, we can efficiently manage user access to our API endpoints.

JWT Authentication Explained

JWT is a token-based authentication mechanism. Unlike traditional session-based methods, JWT stores user information in a compact, self-contained token. This token is sent with every request, allowing the server to verify the user's identity without relying on server-side sessions.

Here's a breakdown of JWT's benefits:

  • Stateless: Removes the need for session management on the server.
  • Secure: Employs digital signatures to prevent tampering.
  • Flexible: Can be configured with various claims to store user information.

Persistence Layer

In this article, we will be using PostgreSQL as our database. You can maintain your database in any database management system. For a convenient deployment option, consider cloud-based solutions like Rapidapp, which offers managed PostgreSQL databases, simplifying setup and maintenance.

tip

Create a free database in Rapidapp in seconds here

Step-by-Step Implementation

Dependencies

Make sure you have the following dependencies installed using your favourite dependency management tool, e.g. Maven or Gradle.

pom.xml
 <dependencies>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-security</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-web</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-data-jpa</artifactId>
</dependency>
<dependency>
<groupId>org.postgresql</groupId>
<artifactId>postgresql</artifactId>
<version>42.7.3</version>
</dependency>
<dependency>
<groupId>org.projectlombok</groupId>
<artifactId>lombok</artifactId>
<version>1.18.32</version>
</dependency>
<dependency>
<groupId>io.jsonwebtoken</groupId>
<artifactId>jjwt-api</artifactId>
<version>0.12.5</version>
</dependency>
<dependency>
<groupId>io.jsonwebtoken</groupId>
<artifactId>jjwt-impl</artifactId>
<version>0.12.5</version>
</dependency>
<dependency>
<groupId>io.jsonwebtoken</groupId>
<artifactId>jjwt-jackson</artifactId>
<version>0.12.5</version>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-test</artifactId>
<scope>test</scope>
</dependency>
</dependencies>

Enabling Spring Web Security

In order to enable Spring Web Security, you need to configure it in your SecurityConfig.java file as shown below.

SecurityConfig.java
@Configuration
@EnableWebSecurity
@RequiredArgsConstructor
public class SecurityConfig {

private static final String[] AUTH_WHITELIST = {
"/api/v1/auth/login",
"/api/v1/auth/register"
};

private final JwtAuthFilter jwtAuthFilter;

@Bean
public SecurityFilterChain securityFilterChain(HttpSecurity http) throws Exception {
http
.csrf(AbstractHttpConfigurer::disable)
.authorizeRequests(authorizeRequests ->
authorizeRequests
.requestMatchers(AUTH_WHITELIST).permitAll()
.anyRequest().authenticated()
)
.sessionManagement(sessionManagement ->
sessionManagement
.sessionCreationPolicy(SessionCreationPolicy.STATELESS))
.addFilterBefore(jwtAuthFilter, UsernamePasswordAuthenticationFilter.class);
return http.build();
}
}

Line 2: Add @EnableWebSecurity to the SecurityConfig class to protect the API endpoints.

Line 6: Allow requests to the /api/v1/auth/login and /api/v1/auth/register endpoints without authentication.

Line 16: Disable CSRF protection, since JWT authentication is stateless.

Line 24: Set the session creation policy to STATELESS to ensure sessions are not maintained.

Line 25: Add the JwtAuthFilter to the security filter chain before the UsernamePasswordAuthenticationFilter. We will explain JwtAuthFilter class soon.

JWT Auth Filter

In order to enable JWT authentication, you need to configure it in your JwtAuthFilter.java file as shown below.

JwtAuthFilter.java
@Component
@RequiredArgsConstructor
public class JwtAuthFilter extends OncePerRequestFilter {

private final JwtService jwtService;
private final UserDetailsService userDetailsService;

@Override
protected void doFilterInternal(HttpServletRequest request, HttpServletResponse response, FilterChain filterChain) throws ServletException, IOException {
if (request.getServletPath().contains("/api/v1/auth")) {
filterChain.doFilter(request, response);
return;
}

final String authorizationHeader = request.getHeader("Authorization");
final String jwtToken;
final String email;

if (authorizationHeader == null || !authorizationHeader.startsWith("Bearer ")) {
filterChain.doFilter(request, response);
return;
}

jwtToken = authorizationHeader.substring(7);
email = jwtService.extractEmail(jwtToken);

if (email != null && SecurityContextHolder.getContext().getAuthentication() == null) {
UserDetails userDetails = userDetailsService.loadUserByUsername(email);
if (jwtService.validateToken(jwtToken, userDetails)) {
UsernamePasswordAuthenticationToken authenticationToken = new UsernamePasswordAuthenticationToken(userDetails, null, userDetails.getAuthorities());
authenticationToken.setDetails(new WebAuthenticationDetailsSource().buildDetails(request));
SecurityContextHolder.getContext().setAuthentication(authenticationToken);
}
}
filterChain.doFilter(request, response);
}
}

Line 10: Do not apply JWT auth filter for /api/v1/auth endpoints.

Line 24: Extract JWT token from the Authorization header. Its format is Bearer <token>, that's why it is substring(7).

Line 25: Extract email from the JWT token using JwtService which we will take a look at in the next section.

Line 28-32: Load the user details from UserDetailsService, validate the JWT token using JwtService, and store the resulting authentication in the SecurityContextHolder.

Implementing JWTService

This class contains all JWT related functionalities as shown below.

JwtService.java
@Service
public class JwtService {

@Value("${jwt.secret}")
private String secret;

public String extractEmail(String jwtToken) {
return extractClaim(jwtToken, Claims::getSubject);
}

public <T> T extractClaim(String jwtToken, Function<Claims, T> claimsResolver) {
final Claims claims = extractAllClaims(jwtToken);
return claimsResolver.apply(claims);
}

private Claims extractAllClaims(String jwtToken) {
return Jwts.parser().verifyWith(getSigningKey()).build().parseSignedClaims(jwtToken).getPayload();
}

private SecretKey getSigningKey() {
byte [] bytes = Decoders.BASE64.decode(secret);
return Keys.hmacShaKeyFor(bytes);
}

public boolean validateToken(String jwtToken, UserDetails userDetails) {
final String email = extractEmail(jwtToken);
return email.equals(userDetails.getUsername()) && !isTokenExpired(jwtToken);
}

private boolean isTokenExpired(String jwtToken) {
return extractExpiration(jwtToken).before(new Date());
}

private Date extractExpiration(String jwtToken) {
return extractClaim(jwtToken, Claims::getExpiration);
}

public String generateToken(User u) {
return createToken(u.getEmail());
}

private String createToken(String email) {
return Jwts.builder()
.subject(email)
.issuedAt(new Date(System.currentTimeMillis()))
.expiration(new Date(System.currentTimeMillis() + 1000 * 60 * 60 * 10))
.signWith(getSigningKey())
.compact();
}
}

Line 5: This is the secret key used to sign JWT tokens. It should be carefully protected and never shared or exposed publicly. The remaining functions are self-explanatory.
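Since the secret is decoded with Decoders.BASE64, the jwt.secret property must hold a Base64-encoded value of sufficient length. As a rough sketch (the class name and where you store the output are assumptions), you could generate one with plain JDK classes:

SecretGenerator.java
import java.security.SecureRandom;
import java.util.Base64;

// One-off helper to produce a Base64-encoded 256-bit secret for the
// jwt.secret property. Keep the output in an environment variable or a
// secret manager, never in the codebase.
public class SecretGenerator {
    public static void main(String[] args) {
        byte[] key = new byte[32]; // 256 bits is enough for HMAC-SHA256
        new SecureRandom().nextBytes(key);
        System.out.println(Base64.getEncoder().encodeToString(key));
    }
}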

UserDetailsService

UserDetailsService tells Spring Security how to load user details from the database, as shown below.

UserDetailsService.java
@Service
@RequiredArgsConstructor
public class UserDetailService implements UserDetailsService {
private final UserRepository userRepository;


@Override
public UserDetails loadUserByUsername(String email) throws UsernameNotFoundException {
return userRepository.findByEmail(email)
.map(user -> User.builder().username(user.getEmail())
.password(user.getPassword())
.build())
.orElseThrow(() -> new UsernameNotFoundException("User not found"));
}
}
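The User entity and UserRepository are not shown in the snippets above. A minimal sketch of what they could look like follows; the table name, ID strategy, and annotations are assumptions, and only the fields and the findByEmail method are taken from the code we have already seen.

User.java
import jakarta.persistence.Entity;
import jakarta.persistence.GeneratedValue;
import jakarta.persistence.GenerationType;
import jakarta.persistence.Id;
import jakarta.persistence.Table;
import lombok.AllArgsConstructor;
import lombok.Builder;
import lombok.Getter;
import lombok.NoArgsConstructor;

// Hypothetical entity sketch; "users" avoids clashing with the reserved
// "user" keyword in PostgreSQL.
@Entity
@Table(name = "users")
@Getter
@Builder
@NoArgsConstructor
@AllArgsConstructor
public class User {
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;
    private String email;
    private String password;
    private String firstName;
    private String lastName;
}

UserRepository.java
import java.util.Optional;
import org.springframework.data.jpa.repository.JpaRepository;

// findByEmail is the only query the other snippets rely on.
public interface UserRepository extends JpaRepository<User, Long> {
    Optional<User> findByEmail(String email);
}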

Up to this point, we have only focused on validating JWT tokens. But how do we generate a token in the first place, and what is its use case? Let's look at that next.

Registering User

Before generating a JWT token to authenticate a user, we first need to register that user. We will use AuthController to register and log in users.

AuthController.java
@RestController
@RequestMapping(path = "api/v1/auth")
@RequiredArgsConstructor
public class AuthController {

private final AuthService authService;


@PostMapping(path = "/register")
@ResponseStatus(HttpStatus.NO_CONTENT)
public void register(@RequestBody RegisterRequest registerRequest) {
authService.register(registerRequest);
}

@PostMapping(path = "/login")
public ResponseEntity<String> login(@RequestBody LoginRequest loginRequest) {
return ResponseEntity.ok(authService.login(loginRequest));
}
}
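The RegisterRequest and LoginRequest payloads are not shown here. A minimal sketch, assuming Lombok-generated getters (the field names are derived from the getters used in AuthService below), could be:

RegisterRequest.java
import lombok.Data;

@Data
public class RegisterRequest {
    private String email;
    private String password;
    private String firstName;
    private String lastName;
}

LoginRequest.java
import lombok.Data;

@Data
public class LoginRequest {
    private String email;
    private String password;
}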

In the above controller, we use AuthService to register and log in users. AuthService uses UserRepository to interact with the database for user-related operations.

AuthService.java
@Service
@RequiredArgsConstructor
public class AuthService {

private final UserRepository userRepository;
private final AuthenticationManager authenticationManager;
private final JwtService jwtService;
private final BCryptPasswordEncoder bCryptPasswordEncoder;

public void register(RegisterRequest registerRequest) {
User u = User.builder()
.email(registerRequest.getEmail())
.password(bCryptPasswordEncoder.encode(registerRequest.getPassword()))
.firstName(registerRequest.getFirstName())
.lastName(registerRequest.getLastName())
.build();
userRepository.save(u);
}

public String login(LoginRequest loginRequest) {
authenticationManager.authenticate(new UsernamePasswordAuthenticationToken(loginRequest.getEmail(), loginRequest.getPassword()));
User u = userRepository.findByEmail(loginRequest.getEmail()).orElseThrow(() -> new EntityNotFoundException("User not found"));
return jwtService.generateToken(u);

}
}

Line 10: Register user by using the details provided in the request payload. The bCryptPasswordEncoder is used to hash the password before storing it in the database.

Line 21: The login operation is done through authenticationManager since it knows how to validate username and password.
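Note that AuthService injects an AuthenticationManager and a BCryptPasswordEncoder, and neither is exposed as a bean automatically. A minimal configuration sketch (the class name and wiring are assumptions, built from standard Spring Security building blocks) might look like this:

ApplicationConfig.java
import lombok.RequiredArgsConstructor;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.security.authentication.AuthenticationManager;
import org.springframework.security.authentication.AuthenticationProvider;
import org.springframework.security.authentication.dao.DaoAuthenticationProvider;
import org.springframework.security.config.annotation.authentication.configuration.AuthenticationConfiguration;
import org.springframework.security.core.userdetails.UserDetailsService;
import org.springframework.security.crypto.bcrypt.BCryptPasswordEncoder;

@Configuration
@RequiredArgsConstructor
public class ApplicationConfig {

    private final UserDetailsService userDetailsService;

    @Bean
    public BCryptPasswordEncoder bCryptPasswordEncoder() {
        return new BCryptPasswordEncoder();
    }

    // Authenticate against our UserDetailService using BCrypt-hashed passwords.
    @Bean
    public AuthenticationProvider authenticationProvider() {
        DaoAuthenticationProvider provider = new DaoAuthenticationProvider();
        provider.setUserDetailsService(userDetailsService);
        provider.setPasswordEncoder(bCryptPasswordEncoder());
        return provider;
    }

    @Bean
    public AuthenticationManager authenticationManager(AuthenticationConfiguration config) throws Exception {
        return config.getAuthenticationManager();
    }
}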

Restricted Access to UserController

Below you can see a sample endpoint implementation for the user object.

UserController.java
@RestController
@RequestMapping(path = "api/v1")
public class UserController {
private final UserRepository userRepository;
public UserController(UserRepository userRepository) {
this.userRepository = userRepository;
}

@GetMapping("/users")
public List<User> getUsers() {
return userRepository.findAll();
}
}

Assume you registered a new user with the email admin and the password ssshhhh. Then, in order to generate a JWT token, you can use the following curl request.

curl -X POST -H "Content-Type: application/json" \
-d '{"email": "admin", "password": "ssshhhh"}' http://localhost:8080/api/v1/auth/login

It will return a JWT token, which you can use to authenticate the user. Store it somewhere.

Now in order to access restricted user endpoint, you can use the following curl request.

curl -X GET -H "Authorization: Bearer <token>" http://localhost:8080/api/v1/users

Conclusion

This hands-on tutorial equipped you with the knowledge to implement JWT Authentication in your Spring Boot application. We explored user registration, login, and access control, leveraging Spring Security and JPA for data persistence. By following these steps and customizing the code examples to your specific needs, you can secure your API endpoints and ensure authorized user access. Remember to prioritize security best practices. Here are some additional points to consider:

  • Secret Key Management: Store your JWT secret key securely in environment variables or a dedicated secret management service. Never expose it in your codebase.
  • Token Expiration: Set a reasonable expiration time for JWT tokens to prevent unauthorized access due to compromised tokens.
  • Error Handling: Implement proper error handling mechanisms for invalid or expired tokens to provide informative feedback to users.
  • Advanced Features: Explore advanced JWT features like refresh tokens for longer-lived sessions and role-based access control (RBAC) for granular authorization.
With JWT authentication in place, your Spring Boot application is well on its way to becoming a secure and robust platform. Deploy it with confidence, knowing that user access is properly controlled.
tip

You can find the complete source code for this project on GitHub.

Building a Realtime Chat App with React, Node.js, and PostgreSQL

· 6 min read
Huseyin BABAL
Software Developer

Introduction

In today's interactive world, real-time communication thrives. This article guides you through building a basic real-time chat application using React for the frontend, Node.js for the backend, and PostgreSQL for data persistence. We'll leverage WebSockets to establish a persistent connection between clients (web browsers) and the server, enabling instant message updates.

Understanding WebSockets

Imagine a two-way highway where messages flow seamlessly between clients and servers. That's the essence of WebSockets. Unlike traditional HTTP requests, which are one-off interactions, WebSockets facilitate a long-lasting connection, allowing for real-time data exchange.

WebSockets in Action

WebSockets empower a variety of applications:

  • Chat Applications: Deliver messages instantly, fostering a more engaging conversation experience.
  • Collaborative Editing: Enable multiple users to work on a document simultaneously, seeing changes in real-time.
  • Stock Tickers: Continuously update stock prices on a financial dashboard.
  • Multiplayer Games: Facilitate smooth real-time interactions among players.
  • Social Media Updates: Receive notifications and updates without page reloads.

PostgreSQL: A Robust Database

PostgreSQL, a powerful open-source object-relational database management system, provides an excellent platform for storing chat messages and potentially user information. Its key features include:

  • ACID Transactions: Ensure data integrity through atomicity, consistency, isolation, and durability.
  • Scalability: Handle large chat datasets efficiently.
  • Flexibility: Model complex data structures for additional features.
For a convenient deployment option, consider cloud-based solutions like Rapidapp, which offers managed PostgreSQL databases, simplifying setup and maintenance.
tip

Create a free database in Rapidapp in seconds here

Backend: Node.js WebSocket Server

Here's a breakdown of the Node.js code utilizing websocket to establish a WebSocket connection and handle message broadcasting:

server.js
const WebSocket = require('websocket').server;
const http = require('http');
const cors = require('cors');
const Sequelize = require('sequelize');

const sequelize = new Sequelize(process.env.DATABASE_URL)

const Message = sequelize.define('Message', {
username: Sequelize.DataTypes.STRING,
text: Sequelize.DataTypes.STRING,
timestamp: Sequelize.DataTypes.DATE
});

sequelize.authenticate().then(() =>{
console.log("Connection has been established successfully.")
Message.sync();
createServer();
}).catch((err) => {
console.error("Unable to connect to the database:", err)
})

function createServer() {

const httpServer = http.Server((req, res) => {
cors()(req, res, () => {
if (req.url === '/messages') {
fetchMessages().then((messages) => {
res.writeHead(200, {
'Content-Type': 'application/json'
})
res.end(JSON.stringify(messages))
})
}
})
})

const webSocketServer = new WebSocket({
httpServer: httpServer
})

webSocketServer.on('request', (req) => {
const connection = req.accept(null, req.origin);

connection.on('message', (message) => {
let msg = JSON.parse(message.utf8Data)
Message.create({
username: msg.username,
text: msg.text,
timestamp: msg.timestamp

})
webSocketServer.broadcast(JSON.stringify(message))
})

connection.on('close', () => {
// defer conn
})
})

httpServer.listen(3005, () => console.log('Listening on port 3005'));
}

function fetchMessages() {
return Message.findAll();
}

Line 1-4: Import necessary modules and libraries. We use the websocket library to create a WebSocket server. The http module is used to create an HTTP server, and cors is used to enable Cross-Origin Resource Sharing to be able to access http backend from a different origin.

Line 6: Sequelize is a Node.js ORM library to access various databases. Create a Sequelize instance with the connection string from Rapidapp or your own managed PostgreSQL service. The url format should be like postgresql://<user>:<password>@<host>:<port>/<dbname>?ssl=true&sslmode=no-verify&application_name=<client_name>

Line 8-12: Define a Message model with username, text, and timestamp fields.

Line 14-20: Authenticate the connection to the database and create the Message table if it doesn't exist. If the connection is successful, start the server.

Line 24-35: Create an HTTP server to handle incoming requests. If the request URL is /messages, fetch all messages from the database and return them as JSON.

Line 37-39: Create a WebSocket server using the websocket library and attach it to the HTTP server.

Line 44-53: Handle incoming WebSocket requests. When a message is received, parse it, save it to the database, and broadcast it to all connected clients.

Line 60: Start the HTTP server on port 3005.

Frontend: React with WebSocket Connection

Here's the React component managing the chat interface, message sending, and receiving messages through the WebSocket connection:

App.js
import './App.css';
import {useEffect, useState, useRef} from "react";

function App() {

const [messages, setMessages] = useState([]);
const [messageInput, setMessageInput] = useState('');

const socket = useRef(null);

const [username, setUsername] = useState('');

const [existingUserName, setExistingUsername] = useState(localStorage.getItem('username') || '');


useEffect(() => {
socket.current = new WebSocket('ws://localhost:3005');

socket.current.onopen = () => {
console.log('Connected successfully.');
}

socket.current.onmessage = (event) => {
const msg = JSON.parse(event.data);
setMessages([...messages, JSON.parse(msg['utf8Data'])])
}

return () => {
socket.current.close();
};
}, [messages]);

useEffect(() => {
fetchMessages();
}, []);


const sendMessage = () => {
if (messageInput.trim() === '') {
return
}

const message = {
text: messageInput,
username: existingUserName,
timestamp: new Date().toISOString(),
}

socket.current.send(JSON.stringify(message))
setMessageInput('')
}

const fetchMessages = () => {
fetch('http://localhost:3005/messages')
.then((res) => res.json())
.then((data) => {
setMessages(data)
})
}

return (
<div className="App">
{existingUserName ? (
<div className="chat-container">
<div className="chat-messages">
{messages.map((message, index) => (
<div className="message sent">
<span className="message-timestamp">{message['username']}</span>
<span className="message-content">{message['text']}</span>
</div>
))}
</div>
<div className="chat-input">
<input
type="text"
placeholder="Type your message"
value={messageInput}
onChange={(e) => setMessageInput(e.target.value)}
onKeyDown={(e) => {
if (e.key === 'Enter' && messageInput.trim() !== '') {
sendMessage();
}
}}
/>
<button onClick={sendMessage}>Send</button>
</div>
</div>) : (
<div className="chat-input">
<input
type="text"
placeholder="Enter your username"
value={username}
onChange={(e) => setUsername(e.target.value)}
/>
<button onClick={() => {
const u=username.trim()
localStorage.setItem('username', u);
setExistingUsername(u)
}}>Connect</button>
</div>
)}
</div>
);
}

export default App;

Line 17: Connect to the WebSocket server running on ws://localhost:3005.

Line 23: Handle incoming messages from the WebSocket server. Parse the message and update the messages list.

Line 34: Fetch messages from the server when the component mounts.

Line 38: A function that sends a message to the WebSocket server.

Line 63: Render the chat interface. If the user has not set a username, prompt them to enter one.

Conclusion

This article has provided a foundational understanding of building a real-time chat application using React, Node.js, WebSockets, and PostgreSQL. The code snippets demonstrate the core functionalities of establishing a WebSocket connection, sending messages, broadcasting them to connected clients, and persisting messages in a database. Remember:

  • This is a simplified example, and real-world applications might require additional features like user authentication, authorization, message editing/deletion, and handling disconnects gracefully.
  • Consider implementing security measures to protect your application from malicious attacks.
  • For large-scale applications, further optimize the database structure and explore scaling strategies.
By building upon this foundation and tailoring it to your specific needs, you can create a dynamic and engaging real-time chat application.
tip

You can find the complete source code for this project on GitHub.