Building a Todo API with Rust - A Step-by-Step Guide Using Axum and Diesel

· 7 min read
Huseyin BABAL
Software Developer

Introduction

In the world of web development, performance and safety are paramount. Rust, with its emphasis on speed and memory safety, has emerged as a powerful language for building robust web applications. Today, we'll explore how to create a high-performance RESTful API for a Todo application using Rust, along with two of its most popular libraries: Axum for web services and Diesel for the ORM.

  • Rust: A systems programming language that runs blazingly fast and prevents segfaults.
  • Axum: A web application framework that focuses on ergonomics and modularity.
  • Diesel: A safe, extensible ORM and query builder for Rust.

Prerequisites

  • Rust
  • Cargo for package management
  • Diesel
  • PostgreSQL

In this article, we will be using PostgreSQL as our database. You can maintain your database in any database management system. For a convenient deployment option, consider cloud-based solutions like Rapidapp, which offers managed PostgreSQL databases, simplifying setup and maintenance.
tip

Create a free database in Rapidapp in seconds here

Getting Started

You can initialize the project and add required dependencies as follows;

# Initialize the project
cargo new todo-rs
cd todo-rs
# Add dependencies
cargo add \
axum \
tokio \
serde \
serde_json \
diesel \
dotenvy \
-F tokio/full,serde/derive,diesel/postgres,diesel/r2d2

cargo add is used to add dependencies; if you also want to enable features of a specific crate (package), you pass them with the -F parameter. For example, to include the postgres feature of diesel, the notation is diesel/postgres. The command above will populate the Cargo.toml file as follows;

[package]
name = "todo-rs"
version = "0.1.0"
edition = "2021"

[dependencies]
axum = "0.7.5"
axum-macros = "0.4.1"
diesel = { version = "2.2.2", features = ["postgres", "r2d2"] }
dotenvy = "0.15.7"
serde = { version = "1.0.204", features = ["derive"] }
serde_json = "1.0.122"
tokio = { version = "1.39.2", features = ["full"] }

DB Migration with Diesel
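
Diesel's migrations are driven by the standalone Diesel CLI, and both the CLI and dotenvy read the database connection string from the DATABASE_URL variable. A minimal setup sketch (the connection string below is a placeholder for your own database):

# Install the Diesel CLI with PostgreSQL support (skip if already installed)
cargo install diesel_cli --no-default-features --features postgres

# Diesel CLI and dotenvy both read DATABASE_URL; keeping it in a .env file
# in the project root is the usual approach (placeholder credentials below)
echo 'DATABASE_URL=postgres://user:password@localhost:5432/todo_rs' > .env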

You can initialize the migration for your project for the first time with the following;

diesel setup

This will create a migrations folder and a diesel.toml file in the project root. In this article, we will implement a Todo REST API, and the only business model we have is todos. In order to generate a migration for the todos entity, we can use the following command.

diesel migration generate create_todos_table

This will generate a dedicated migration folder for the todos table creation. When you open that folder, you will see up.sql and down.sql files. up.sql is executed when we run the migration to apply the DB changes; down.sql is executed when we revert them. We are responsible for filling in those SQL files, as you can see below.

up.sql
CREATE TABLE todos (
    id SERIAL PRIMARY KEY,
    title TEXT NOT NULL,
    content TEXT NOT NULL
);
down.sql
DROP TABLE todos;

Now we can run diesel migration run to apply the migrations. This will create the table and also generate a src/schema.rs file containing the mapped table definition for the todo entity as follows

src/schema.rs
// @generated automatically by Diesel CLI.

diesel::table! {
    todos (id) {
        id -> Int4,
        title -> Text,
        content -> Text,
    }
}

As you can clearly see, it is generated by Diesel and you shouldn't edit it manually.

Implement Axum Server

In this section we will implement the part responsible for running an HTTP server with Axum. This server will expose the handlers for the CRUD operations of the Todo entity.

src/main.rs
#[tokio::main]
async fn main() {
    dotenv().ok();
    let database_url = env::var("DATABASE_URL").expect("DATABASE_URL must be set");
    let manager = ConnectionManager::<PgConnection>::new(database_url);
    let pool = r2d2::Pool::builder()
        .max_size(5)
        .build(manager)
        .expect("Failed to create pool.");
    let db_connection = Arc::new(pool);

    let app = Router::new()
        .route("/todos", post(handlers::create_todo))
        .route("/todos", get(handlers::get_todos))
        .route("/todos/:id", get(handlers::get_todo))
        .route("/todos/:id", post(handlers::update_todo))
        .route("/todos/:id", delete(handlers::delete_todo))
        .with_state(db_connection.clone());

    let listener = tokio::net::TcpListener::bind("0.0.0.0:8080").await.unwrap();
    let server = axum::serve(listener, app).with_graceful_shutdown(shutdown_signal());

    tokio::spawn(async move {
        println!("Server is running");
    });

    if let Err(e) = server.await {
        eprintln!("Server error: {}", e);
    }
}

Line 4-10: Set up the database connection pool

Line 12-18: Define the routes for our API

Line 20-21: Set up the server address

Line 23-30: Log application startup or failure

src/main.rs is the file where we mostly do our global initializations, such as setting up the database connection pool and wiring the REST endpoints. Now that we have endpoints for the Todo entity, let's implement the real logic of those handlers.
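
The with_graceful_shutdown call above references a shutdown_signal helper that the listing doesn't show; a minimal version, assuming Ctrl+C is the only shutdown source you care about, could look like this:

async fn shutdown_signal() {
    // Resolve when the process receives Ctrl+C, letting axum::serve finish in-flight requests
    tokio::signal::ctrl_c()
        .await
        .expect("failed to install Ctrl+C handler");
}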

Implementing Handlers

Create Todo Handler

In this handler, we accept a NewTodo request and create a new record in the database. In Axum handlers, you can see a State extractor beside the request body; it is used for passing dependencies, such as the database connection pool, into the handler for DB operations.
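
The handlers below also rely on a Todo struct, NewTodo and UpdateTodo payloads, and a DbPool state type that the article doesn't list explicitly. A minimal sketch of how they might be declared (the file name src/models.rs and the optional fields on UpdateTodo are assumptions; field names follow the todos migration above):

src/models.rs
use std::sync::Arc;

use diesel::prelude::*;
use diesel::r2d2::{self, ConnectionManager};
use serde::{Deserialize, Serialize};

use crate::schema::todos;

// Connection pool handle shared with the handlers through Axum state
pub type DbPool = Arc<r2d2::Pool<ConnectionManager<PgConnection>>>;

// Row read back from the todos table
#[derive(Queryable, Serialize)]
pub struct Todo {
    pub id: i32,
    pub title: String,
    pub content: String,
}

// Payload for POST /todos, insertable into the todos table
#[derive(Insertable, Deserialize)]
#[diesel(table_name = todos)]
pub struct NewTodo {
    pub title: String,
    pub content: String,
}

// Payload for updates; optional fields so either column can be changed on its own
#[derive(AsChangeset, Deserialize)]
#[diesel(table_name = todos)]
pub struct UpdateTodo {
    pub title: Option<String>,
    pub content: Option<String>,
}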

src/handlers.rs
pub async fn create_todo(
    State(db): State<DbPool>,
    Json(new_todo): Json<NewTodo>,
) -> (StatusCode, Json<Todo>) {
    let mut conn = db.get().map_err(|_| StatusCode::INTERNAL_SERVER_ERROR).unwrap();

    let todo = diesel::insert_into(todos::table)
        .values(&new_todo)
        .get_result(&mut conn)
        .map_err(|_| StatusCode::INTERNAL_SERVER_ERROR).unwrap();

    (StatusCode::CREATED, Json(todo))
}

Line 2: Accept db connection pool as dependency

Line 3: Request body as NewTodo

Line 5: Get available connection from DB connection pool, throw error otherwise.

Line 7: Insert new_todo in todos table

Line 12: Return CREATED status code and new todo item as response body

List Todos Handler

src/handlers.rs
pub async fn get_todos(
    State(db): State<DbPool>,
) -> (StatusCode, Json<Vec<Todo>>) {
    let mut conn = db.get().map_err(|_| StatusCode::INTERNAL_SERVER_ERROR).unwrap();

    let results = todos::table.load::<Todo>(&mut conn)
        .map_err(|_| StatusCode::INTERNAL_SERVER_ERROR).unwrap();

    (StatusCode::OK, Json(results))
}

This time, we don't expect anything in the request body; we just return the todo items by using the load function, which maps the rows to the Todo struct. As always, we return the results in the response body with status code OK.

Get Todo Handler

We get the todo id from the path params and query the todos table by filtering on id as follows

src/handlers.rs
pub async fn get_todo(
    Path(todo_id): Path<i32>,
    State(db): State<DbPool>,
) -> (StatusCode, Json<Todo>) {
    let mut conn = db.get().map_err(|_| StatusCode::INTERNAL_SERVER_ERROR).unwrap();

    let result = todos::table.filter(id.eq(todo_id)).first::<Todo>(&mut conn)
        .map_err(|_| StatusCode::INTERNAL_SERVER_ERROR).unwrap();

    (StatusCode::OK, Json(result))
}

Update Todo Handler

In this handler, we accept the update payload from the end user and update the existing Todo by resolving the id from the path params.

src/handlers.rs
pub async fn update_todo(
    Path(todo_id): Path<i32>,
    State(db): State<DbPool>,
    Json(update_todo): Json<UpdateTodo>,
) -> (StatusCode, Json<Todo>) {
    let mut conn = db.get().map_err(|_| StatusCode::INTERNAL_SERVER_ERROR).unwrap();

    let todo = diesel::update(todos::table.filter(id.eq(todo_id)))
        .set(&update_todo)
        .get_result(&mut conn)
        .map_err(|_| StatusCode::INTERNAL_SERVER_ERROR).unwrap();

    (StatusCode::OK, Json(todo))
}

Delete Todo Handler

As you might guess, we resolve the todo id from the path params, then execute a delete query against the todos table as follows.

src/handlers.rs
pub async fn delete_todo(
    Path(todo_id): Path<i32>,
    State(db): State<DbPool>,
) -> StatusCode {
    let mut conn = db.get().map_err(|_| StatusCode::INTERNAL_SERVER_ERROR).unwrap();

    let _ = diesel::delete(todos::table.filter(id.eq(todo_id)))
        .execute(&mut conn)
        .map_err(|_| StatusCode::INTERNAL_SERVER_ERROR).unwrap();

    StatusCode::NO_CONTENT
}

Demo Time

Once you have set the DATABASE_URL environment variable, you can run the application as follows;

cargo run

Here are some Todo operations

Create a todo

curl -X POST -H "Content-Type: application/json" -d '{"title":"Buy groceries","content":"banana,milk"}' http://localhost:8080/todos

List all todos

curl http://localhost:8080/todos

Get a specific todo

curl http://localhost:8080/todos/1

Update a todo

curl -X POST -H "Content-Type: application/json" -d '{"title":"Buy Groceries", "content": "banana"}' http://localhost:8080/todos/1

Delete a todo

curl -X DELETE http://localhost:8080/todos/1

Conclusion

We've successfully built a Todo API using Rust, Axum, and Diesel. This combination provides a robust, safe, and efficient backend for web applications. The strong typing of Rust, combined with Diesel's compile-time checked queries and Axum's ergonomic routing, creates a powerful foundation for building scalable web services. By leveraging Rust's performance and safety features, we can create APIs that are not only fast but also resistant to common runtime errors. As you continue to explore Rust for web development, you'll find that this stack provides an excellent balance of developer productivity and application performance. Remember, this is just the beginning. You can extend this API with authentication, more complex queries, and additional features to suit your specific needs. Happy coding!

tip

You can find the complete source code for this project on GitHub.

Streaming PostgreSQL Changes to Kafka with Debezium

· 8 min read
Huseyin BABAL
Software Developer

Introduction: Why Send Changes to Kafka

In modern distributed systems, keeping multiple services in sync and maintaining data consistency across microservices can be challenging. When dealing with a microservices architecture, it's crucial to have an efficient way to propagate database changes to other services in real time. One effective solution is to publish database changes to a message broker like Apache Kafka. Kafka acts as an intermediary that allows various services to subscribe to these changes and react accordingly. This approach ensures real-time data synchronization, reduces the complexity of direct service-to-service communication, and enhances the overall scalability and fault tolerance of the system.

Use Cases for Publishing Database Changes to Kafka

  • Real-Time Analytics: Feeding database changes to a real-time analytics system to provide up-to-the-minute insights.
  • Event-Driven Architecture: Enabling services to react to database changes, triggering workflows or business processes.
  • Cache Invalidation: Automatically invalidating or updating cache entries based on database changes to ensure consistency.
  • Data Replication: Replicating data across different data stores or geographic regions for redundancy and high availability.
  • Audit Logging: Keeping a comprehensive audit log of all changes made to database for compliance and debugging purposes.

What is Debezium?

Debezium is an open-source distributed platform that captures database changes and streams them to Kafka in real-time. It leverages the database's transaction log to detect changes and publish them as events in Kafka topics. Debezium supports various databases, including PostgreSQL, MySQL, and MongoDB, making it a versatile choice for change data capture (CDC) needs.

PostgreSQL Configuration: Logical WAL Replication

In this article, we will be using PostgreSQL as our database with logical WAL replication enabled. You can maintain your database in any database management system. For a convenient deployment option, consider cloud-based solutions like Rapidapp, which offers managed PostgreSQL databases with built-in logical WAL replication, simplifying setup and maintenance.

tip

Create a free database with built-in logical WAL replication in Rapidapp in seconds here

If you choose to maintain your own PostgreSQL database, you can enable logical WAL replication with the following PostgreSQL configuration.

postgresql.conf
...
wal_level = logical
...

You can see more details about WAL Level in PostgreSQL Documentation.
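
You can verify the active setting from any SQL client; note that changing wal_level requires a PostgreSQL restart.

SHOW wal_level;
-- expected output once logical replication is enabled: logical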

Deploying Debezium Connect with PostgreSQL Connection

There are several ways to deploy Debezium Connect, but we will use Docker to spin up a container running Debezium Connect as follows.

docker run --rm --name debezium \
-e BOOTSTRAP_SERVERS=<bootstrap_servers> \
-e GROUP_ID=1 \
-e CONFIG_STORAGE_TOPIC=connect_configs \
-e OFFSET_STORAGE_TOPIC=connect_offsets \
-e STATUS_STORAGE_TOPIC=connect_statuses \
-e ENABLE_DEBEZIUM_SCRIPTING='true' \
-e CONNECT_SASL_MECHANISM=SCRAM-SHA-256 \
-e CONNECT_SECURITY_PROTOCOL=SASL_SSL \
-e CONNECT_SASL_JAAS_CONFIG='org.apache.kafka.common.security.scram.ScramLoginModule required username="<username>" password="<password>";' \
-p 8083:8083 debezium/connect:2.7

BOOTSTRAP_SERVERS: You can set bootstrap server for this env variable. You can find this on Upstash dashboard if you are using their managed Kafka.

CONNECT_SASL_JAAS_CONFIG: This part contains security module and username/password pair. You don't need to set this if you are not using Kafka with authentication. However, if you are using Kafka from Upstash, then you can find username and password values on Kafka cluster details page.

CONFIG_STORAGE_TOPIC: This environment variable is used to specify the Kafka topic where Debezium will store the connector properties.

OFFSET_STORAGE_TOPIC: This environment variable is used to specify the Kafka topic where Debezium will store the connector offsets.

STATUS_STORAGE_TOPIC: This environment variable is used to specify the Kafka topic where Debezium will store the connector statuses.

Debezium Connect is now running, but it is empty: no source (PostgreSQL in our case) is being tracked yet, and no data is being sent to the sink, which is Kafka in our case.

We will also leverage two SaaS solutions:

  • Rapidapp for PostgreSQL: To quickly set up and manage our PostgreSQL database.
tip

Create a free database in Rapidapp Starter in seconds here

  • Upstash Kafka: A managed Kafka service we will use as the message broker for the change events.
tip

Create a free Kafka cluster in Upstash here

Adding Debezium Connector

You can add a new connector to Debezium Connect by using its REST API as follows.

curl --location 'http://localhost:8083/connectors' \
--header 'Accept: application/json' \
--header 'Content-Type: application/json' \
--data '{
  "name": "postgres-connector",
  "config": {
    "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
    "database.hostname": "<pg_host>",
    "database.port": "<pg_port>",
    "database.user": "<pg_user>",
    "database.password": "<pg_pass>",
    "database.dbname": "<pg_db>",
    "database.server.id": "<unique_id>",
    "table.include.list": "<schema.table_name>",
    "topic.prefix": "<pg_topic>",
    "plugin.name": "pgoutput",
    "kafka.bootstrap.servers": "<kafka_host>:<kafka_port>",
    "kafka.topic.prefix": "<kafka_topic_prefix>"
  }
}'

Line 7: This is needed to tell Debezium how to connect to the source.

Line 8-12: PostgreSQL connection properties; if you are using Rapidapp, you can grab these details on the Connection Properties tab of the database details page.

Line 13: This is the unique database server id, which is used by Debezium to differentiate the sources.

Line 14: This is the list of tables that will be monitored by Debezium.

Line 16: This field tells Debezium which logical decoding plugin to use for this connector to serialize/deserialize data coming from the PostgreSQL WAL.

Once the connector is created, you can verify it by listing available connectors with the following;

curl -XGET http://localhost:8083/connectors

Step-by-Step Spring Boot Application Setup

In this section, we will implement a simple Spring Boot CRUD application where every modification in the PostgreSQL database is synchronized to Kafka automatically. This is especially useful when some other service is interested in those changes. In our case, we will be maintaining Product information in the PostgreSQL database. Let's get started!

Project Initialization and Dependencies

We will be using Spring Boot and PostgreSQL to build the application. You can initialize a Spring Boot project by using the Spring Boot CLI. Once installed, you can use the following command to initialize a project with the required dependencies.

spring init \
--dependencies=web,data-jpa,postgresql,lombok \
--type=maven-project \
--javaVersion=21 \
spring-pg-debezium

Line 2: web for implementing REST endpoints, data-jpa for database persistence, and postgresql for PostgreSQL driver.

Line 3: --type=maven-project for creating a Maven project.

Line 4: --javaVersion=21 sets Java 21 as the project's Java version.

Implementing Entity and Repository

We have only one entity here, Product, which will be used to store product information. Let's create a new entity called Product as follows.

@Entity
@Data
@NoArgsConstructor
@AllArgsConstructor
class Product {

    @Id
    @GeneratedValue
    private Long id;

    private String title;

    @Column(name = "price", precision = 10, scale = 2)
    private BigDecimal price;
}

Line 2: Automatically enable getter/setter methods by using Lombok

Line 3: Generate no-arg constructor

Line 4: Generate constructor with all instance variables

Line 13: Define the price column to accept values with at most 10 digits in total, 2 of them after the decimal point, e.g. 12345678.99

In order to manage the Product entity in the database, we will use the following repository interface.

interface ProductRepository extends CrudRepository<Product, Long> {}

Implementing Rest Endpoints

We have one root endpoint /api/v1/products inside one controller and implement 3 actions for create, update, and delete as follows

@RestController
@RequestMapping("/api/v1/products")
@RequiredArgsConstructor
class ProductController {

    private final ProductRepository productRepository;

    @PostMapping
    void create(@RequestBody CreateProductRequest request) {
        Product product = new Product();
        product.setTitle(request.getTitle());
        product.setPrice(request.getPrice());
        productRepository.save(product);
    }

    @PatchMapping("/{id}")
    void update(@RequestBody UpdateProductRequest request, @PathVariable("id") Long id) {
        Product p = productRepository.findById(id).orElseThrow(() -> new EntityNotFoundException("Product not found"));
        p.setPrice(request.getPrice());
        productRepository.save(p);
    }

    @DeleteMapping("/{id}")
    void delete(@PathVariable("id") Long id) {
        productRepository.deleteById(id);
    }
}

The create method accepts a CreateProductRequest, which contains the title and price information as shown below.

@Data
@NoArgsConstructor
@AllArgsConstructor
class CreateProductRequest {

    private String title;

    private BigDecimal price;

}

update is used to update product price, and it accepts a request as follows.

@Data
@NoArgsConstructor
@AllArgsConstructor
class UpdateProductRequest {

    private BigDecimal price;

}

Now that the persistence layer and REST endpoints are ready, we can configure the application.

Application Configuration

This section contains application level configurations such as the application name, datasource, and jpa as shown below:

application.yaml
spring:
  application:
    name: spring-pg-debezium
  datasource:
    url: <connection-string-from-rapidapp|or your own managed postgres url>
    username: <username>
    password: <password>
  jpa:
    database-platform: org.hibernate.dialect.PostgreSQLDialect
    hibernate:
      ddl-auto: update

Line 5: Connection URL for the PostgreSQL database. You can obtain this from Rapidapp or your own managed PostgreSQL service. It should have a format like jdbc:postgresql://<host>:<port>/<database>?sslmode=require.

Running Application

You can run the application as follows

./mvnw spring-boot:run

Demo

Once you perform any of the following requests, the change will be published to the Kafka cluster, where you can consume and inspect the message.
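
Debezium publishes each captured table to a topic named <topic.prefix>.<schema>.<table>, so with the connector config above and the default public schema, the Product table would land in a topic like <pg_topic>.public.product. A sketch of watching it with the standard console consumer (the broker address, authentication options, and exact topic name depend on your setup):

kafka-console-consumer.sh \
--bootstrap-server <kafka_host>:<kafka_port> \
--topic <pg_topic>.public.product \
--from-beginning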

Create Product

curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/products -d '{"title": "Blue Iphone", "price": "37.3213"}'

Update Product

curl -XPATCH -H "Content-Type: application/json" http://localhost:8080/api/v1/products/1 -d '{"price": "37.1213"}'

Delete Product

curl -XDELETE  http://localhost:8080/api/v1/products/1
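
For reference, each message Debezium publishes wraps the row in an envelope with before/after images and an op code (c = create, u = update, d = delete). A trimmed sketch of what the create request above might produce (field values and exact formatting are illustrative):

{
  "before": null,
  "after": { "id": 1, "title": "Blue Iphone", "price": "37.32" },
  "source": { "connector": "postgresql", "table": "product" },
  "op": "c",
  "ts_ms": 1721900000000
}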

Conclusion

Integrating Debezium with PostgreSQL and Kafka in a Spring Boot environment allows you to efficiently stream database changes to various services. This setup not only enhances data consistency and real-time processing capabilities but also simplifies the architecture of your microservices. By following this guide, you can leverage the power of change data capture to build responsive and scalable applications.

tip

You can find the complete source code for this project on GitHub.

Building Location Based Search Service with Spring Boot PostgreSQL and PostGIS

· 12 min read
Huseyin BABAL
Software Developer

Introduction to Geospatial Data

Geospatial data, also known as spatial data, represents the physical location and shape of objects on the Earth's surface. It includes information such as latitude, longitude, altitude, and the spatial relationships between different objects. Geospatial data is used in a wide range of applications, from mapping and navigation to environmental monitoring and urban planning.

Use Cases for Geospatial Data

Geospatial data has numerous applications across various industries. Some common use cases include:

  • Navigation and Routing: GPS systems use geospatial data to provide real-time navigation and routing information.
  • Environmental Monitoring: Track changes in land use, deforestation, and urban sprawl using satellite imagery and geospatial analysis.
  • Urban Planning: Plan infrastructure projects, analyze traffic patterns, and manage public services using geospatial data.
  • Location-Based Services: Deliver personalized content, offers, and services based on a user's location.

Using Geospatial Data in PostgreSQL with PostGIS Extension

In this article, we will be using PostgreSQL as our database with PostGIS extension. You can maintain your database in any database management system. For a convenient deployment option, consider cloud-based solutions like Rapidapp, which offers managed PostgreSQL databases with built-in postgis extension, simplifying setup and maintenance.

tip

Create a free database with built-in postgis extension in Rapidapp in seconds here

If you choose to maintain your own PostgreSQL database, you can enable the PostGIS extension with the following command for each database;

CREATE EXTENSION postgis;

Step-by-Step Guide to Creating the Location-Based Search Service

One practical application of geospatial data is a geolocation search application, where users can find nearby points of interest within a specified radius. In this article, we will build a Spring Boot application that searches for cities within a specified radius of a given point.

Project Initialization and Dependencies

We will be using Spring Boot and PostgreSQL to build the application. You can initialize a Spring Boot project by using the Spring Boot CLI. Once installed, you can use the following command to initialize a project with the required dependencies.

spring init \
--dependencies=web,data-jpa,postgresql,lombok \
--type=maven-project \
--javaVersion=21 \
spring-postgres-spatial

Line 2: web for implementing REST endpoints, data-jpa for database persistence, and postgresql for PostgreSQL driver.

Line 3: --type=maven-project for creating a Maven project.

Line 4: --javaVersion=21 sets Java 21 as the project's Java version.

There is one more dependency we need to add to enable the spatial features of Hibernate: hibernate-spatial. Open pom.xml and add the following dependency to the dependencies section.

<dependency>
    <groupId>org.hibernate</groupId>
    <artifactId>hibernate-spatial</artifactId>
    <version>6.5.2.Final</version>
</dependency>

Now that we initialized the project, go to the folder spring-postgres-spatial and open it with your favourite IDE.

Implementing Entity and Repository

We have only one entity here, City, which will be used to store city information including its location. Let's create a new entity called City as follows.

@Entity
@Data
@NoArgsConstructor
@AllArgsConstructor
class City {

    @Id
    @GeneratedValue
    private Long id;

    private String name;

    @Column(columnDefinition = "geography(Point, 4326)")
    private Point location;
}

Line 2: Automatically enable getter/setter methods by using Lombok

Line 3: Generate no-arg constructor

Line 4: Generate constructor with all instance variables

Line 13: This uses the special PostGIS data type geography, described as follows;

geography: This indicates that the column will use the PostGIS geography data type, which is designed for storing geospatial data in a way that accounts for the Earth's curvature. This type is particularly useful for global, large-scale datasets where you want accurate distance and area calculations.

Point: Specifies that the data type for this column is a geographic point. Points are used to store coordinates (latitude and longitude).

4326: This is the Spatial Reference System Identifier (SRID) for WGS 84, which is the standard coordinate system used by GPS. SRID 4326 ensures that the coordinates are stored in a globally recognized format.

In order to manage the City entity in the database, we will use the following repository interface.

interface CityRepository extends CrudRepository<City, Long> {
    @Query("SELECT c FROM City c WHERE function('ST_DWithin', c.location, :point, :distance) = true")
    Iterable<City> findNearestCities(Point point, double distance);
}

ST_DWithin returns true if the geometries are within a given distance of each other. In our case, the query returns the cities whose location in the City table is within :distance of :point.
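
Under the hood, the function('ST_DWithin', ...) call in the JPQL query maps onto the PostGIS function of the same name. The roughly equivalent raw SQL looks like this (the table and column names assume Hibernate's default naming for the City entity, and ST_MakePoint takes longitude first):

SELECT * FROM city
WHERE ST_DWithin(location, ST_MakePoint(32.8541, 39.9208)::geography, 300000);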

Implementing Rest Endpoints

We have one root endpoint /api/v1/cities inside one controller and implement 3 actions for create, list, and find nearest locations as follows

@RestController
@RequestMapping("/api/v1/cities")
@RequiredArgsConstructor
class CityController {

    private final CityRepository cityRepository;

    private final GeometryFactory geometryFactory;

    @PostMapping
    void create(@RequestBody CreateCityRequest request) {
        Point point = geometryFactory.createPoint(new Coordinate(request.getLng(), request.getLat()));
        City city = new City();
        city.setName(request.getName());
        city.setLocation(point);
        cityRepository.save(city);
    }

    @GetMapping
    List<CityDto> findAll() {
        List<CityDto> cities = new ArrayList<>();
        cityRepository.findAll().forEach(c -> {
            cities.add(new CityDto(c.getName(), c.getLocation().getY(), c.getLocation().getX()));
        });
        return cities;
    }

    @GetMapping("/nearest")
    List<CityDto> findNearestCities(@RequestParam("lat") float lat, @RequestParam("lng") float lng, @RequestParam("distance") int distance) {
        List<CityDto> cities = new ArrayList<>();
        Point point = geometryFactory.createPoint(new Coordinate(lng, lat));
        cityRepository.findNearestCities(point, distance).forEach(c -> {
            cities.add(new CityDto(c.getName(), c.getLocation().getY(), c.getLocation().getX()));
        });
        return cities;
    }
}

Line 8: GeometryFactory is the JTS class pulled in through hibernate-spatial, and it is used to build geometric shapes. In our case, we convert a latitude-longitude pair to a Point, which is then used for the repository operations.
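
Since geometryFactory is injected through the constructor, Spring also needs a GeometryFactory bean in the context. The article doesn't show one, so here is a minimal sketch of how it could be provided (the class name and placement are assumptions):

@Configuration
class GeometryConfig {

    // Expose a single JTS GeometryFactory for injection;
    // SRID 4326 matches the geography(Point, 4326) column definition
    @Bean
    GeometryFactory geometryFactory() {
        return new GeometryFactory(new PrecisionModel(), 4326);
    }
}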

The create method accepts a CreateCityRequest, which contains the name, latitude, and longitude information as shown below.

@AllArgsConstructor
@NoArgsConstructor
@Data
class CreateCityRequest {

    private String name;
    private double lat;
    private double lng;
}

findAll is used to list all available cities in the database.

findNearestCities is used for finding neighbouring cities for a given coordinate and radius (in meters).
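
The CityDto response type used by the controller is not listed in the article; a minimal Lombok-style sketch that matches how the controller constructs it (name, lat, lng) could look like this:

@Data
@AllArgsConstructor
class CityDto {

    private String name;
    private double lat;
    private double lng;
}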

Now that the persistence layer and REST endpoints are ready, we can configure the application.

Application Configuration

This section contains application level configurations such as the application name, datasource, and jpa as shown below:

application.yaml
spring:
  application:
    name: spring-postgres-spatial
  datasource:
    url: <connection-string-from-rapidapp|or your own managed postgres url>
    username: <username>
    password: <password>
  jpa:
    database-platform: org.hibernate.dialect.PostgreSQLDialect
    hibernate:
      ddl-auto: update

Line 5: Connection URL for the PostgreSQL database. You can obtain this from Rapidapp or your own managed PostgreSQL service. It should have a format like jdbc:postgresql://<host>:<port>/<database>?sslmode=require.

Running Application

You can run the application as follows

./mvnw spring-boot:run

Demo

Create City

In this section, we will be creating the cities of Turkey with the following requests.

curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Adana", "lat": "37.0000", "lng": "35.3213"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Adıyaman", "lat": "37.7648", "lng": "38.2786"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Afyonkarahisar", "lat": "38.7507", "lng": "30.5567"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Ağrı", "lat": "39.7191", "lng": "43.0503"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Amasya", "lat": "40.6499", "lng": "35.8353"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Ankara", "lat": "39.9208", "lng": "32.8541"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Antalya", "lat": "36.8841", "lng": "30.7056"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Artvin", "lat": "41.1828", "lng": "41.8183"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Aydın", "lat": "37.8560", "lng": "27.8416"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Balıkesir", "lat": "39.6484", "lng": "27.8826"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Bilecik", "lat": "40.0567", "lng": "30.0665"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Bingöl", "lat": "39.0626", "lng": "40.7696"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Bitlis", "lat": "38.3938", "lng": "42.1232"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Bolu", "lat": "40.5760", "lng": "31.5788"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Burdur", "lat": "37.4613", "lng": "30.0665"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Bursa", "lat": "40.2669", "lng": "29.0634"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Çanakkale", "lat": "40.1553", "lng": "26.4142"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Çankırı", "lat": "40.6013", "lng": "33.6134"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Çorum", "lat": "40.5506", "lng": "34.9556"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Denizli", "lat": "37.7765", "lng": "29.0864"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Diyarbakır", "lat": "37.9144", "lng": "40.2306"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Edirne", "lat": "41.6818", "lng": "26.5623"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Elâzığ", "lat": "38.6810", "lng": "39.2264"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Erzincan", "lat": "39.7500", "lng": "39.5000"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Erzurum", "lat": "39.9000", "lng": "41.2700"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Eskişehir", "lat": "39.7767", "lng": "30.5206"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Gaziantep", "lat": "37.0662", "lng": "37.3833"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Giresun", "lat": "40.9128", "lng": "38.3895"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Gümüşhane", "lat": "40.4386", "lng": "39.5086"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Hakkâri", "lat": "37.5833", "lng": "43.7333"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Hatay", "lat": "36.4018", "lng": "36.3498"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Isparta", "lat": "37.7648", "lng": "30.5566"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Mersin", "lat": "36.8000", "lng": "34.6333"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "İstanbul", "lat": "41.0053", "lng": "28.9770"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "İzmir", "lat": "38.4189", "lng": "27.1287"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Kars", "lat": "40.6167", "lng": "43.1000"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Kastamonu", "lat": "41.3887", "lng": "33.7827"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Kayseri", "lat": "38.7312", "lng": "35.4787"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Kırklareli", "lat": "41.7333", "lng": "27.2167"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Kırşehir", "lat": "39.1425", "lng": "34.1709"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Kocaeli", "lat": "40.8533", "lng": "29.8815"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Konya", "lat": "37.8667", "lng": "32.4833"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Kütahya", "lat": "39.4167", "lng": "29.9833"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Malatya", "lat": "38.3552", "lng": "38.3095"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Manisa", "lat": "38.6191", "lng": "27.4289"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Kahramanmaraş", "lat": "37.5858", "lng": "36.9371"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Mardin", "lat": "37.3212", "lng": "40.7245"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Muğla", "lat": "37.2153", "lng": "28.3636"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Muş", "lat": "38.9462", "lng": "41.7539"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Nevşehir", "lat": "38.6939", "lng": "34.6857"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Niğde", "lat": "37.9667", "lng": "34.6833"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Ordu", "lat": "40.9839", "lng": "37.8764"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Rize", "lat": "41.0201", "lng": "40.5234"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Sakarya", "lat": "40.6940", "lng": "30.4358"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Samsun", "lat": "41.2928", "lng": "36.3313"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Siirt", "lat": "37.9333", "lng": "41.9500"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Sinop", "lat": "42.0231", "lng": "35.1531"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Sivas", "lat": "39.7477", "lng": "37.0179"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Tekirdağ", "lat": "40.9833", "lng": "27.5167"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Tokat", "lat": "40.3167", "lng": "36.5500"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Trabzon", "lat": "41.0015", "lng": "39.7178"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Tunceli", "lat": "39.3074", "lng": "39.4388"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Şanlıurfa", "lat": "37.1591", "lng": "38.7969"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Uşak", "lat": "38.6823", "lng": "29.4082"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Van", "lat": "38.4891", "lng": "43.4089"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Yozgat", "lat": "39.8181", "lng": "34.8147"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Zonguldak", "lat": "41.4564", "lng": "31.7987"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Aksaray", "lat": "38.3687", "lng": "34.0370"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Bayburt", "lat": "40.2552", "lng": "40.2249"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Karaman", "lat": "37.1759", "lng": "33.2287"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Kırıkkale", "lat": "39.8468", "lng": "33.5153"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Batman", "lat": "37.8812", "lng": "41.1351"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Şırnak", "lat": "37.4187", "lng": "42.4918"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Bartın", "lat": "41.5811", "lng": "32.4610"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Ardahan", "lat": "41.1105", "lng": "42.7022"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Iğdır", "lat": "39.8880", "lng": "44.0048"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Yalova", "lat": "40.6500", "lng": "29.2667"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Karabük", "lat": "41.2061", "lng": "32.6204"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Kilis", "lat": "36.7184", "lng": "37.1212"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Osmaniye", "lat": "37.2130", "lng": "36.1763"}'
curl -XPOST -H "Content-Type: application/json" http://localhost:8080/api/v1/cities -d '{"name": "Düzce", "lat": "40.8438", "lng": "31.1565"}'

List Cities

curl -XGET http://localhost:8080/api/v1/cities

Find Nearest Cities

To find the nearest cities to Ankara within a radius of 300 km, you can use the following.

curl -XGET http://localhost:8080/api/v1/cities/nearest\?lat\=39.9208\&lng\=32.8541\&distance\=300000

Conclusion

In this article, we explored the power of geospatial data and how to effectively utilize it within a Spring Boot application using PostgreSQL with the PostGIS extension. We covered the fundamental concepts of geospatial data, the benefits of using PostGIS for geospatial operations, and real-world use cases such as navigation, environmental monitoring, urban planning, and location-based services.

tip

You can find the complete source code for this project on GitHub.

Create and Deploy Spring Boot Todo App to Google Cloud Run

· 5 min read
Huseyin BABAL
Software Developer

Introduction

In the rapidly evolving world of software development, deploying applications in a scalable and efficient manner is critical. With the rise of cloud computing, services like Google Cloud Run have become essential for developers looking to deploy containerized applications quickly and effortlessly. In this blog post, we'll walk through deploying a simple todo app built with Spring Boot and PostgreSQL to Google Cloud Run. We'll cover setting up the project, integrating PostgreSQL, and deploying to the cloud, ensuring your app is ready to handle varying loads efficiently.

Why Connection Pooling is Essential for Serverless?

When deploying applications in a serverless environment like Google Cloud Run, managing database connections efficiently becomes crucial. Traditional connection management can lead to issues such as exhausting database connections, especially under load. This is where PgBouncer, a lightweight connection pooler for PostgreSQL, comes into play. It optimizes the usage of database connections, reducing latency and improving the performance of your serverless app. Additionally, it ensures that the application can handle sudden spikes in traffic without overwhelming the database.

In this article, we will be using PostgreSQL as our database. You can maintain your database in any database management system. For a convenient deployment option, consider cloud-based solutions like Rapidapp, which offers managed PostgreSQL databases, simplifying setup and maintenance.

tip

Create a free database with connection pooling support for the serverless use-cases in Rapidapp in seconds here

Step-by-Step Guide to Creating the Todo App

Project Initialization and Dependencies

We will be using Spring Boot and PostgreSQL to build a todo application. You can initialize a Spring Boot project by using the Spring Boot CLI. Once installed, you can use the following command to initialize a project with the required dependencies.

spring init \
--dependencies=web,data-jpa,postgresql \
--type=maven-project \
--javaVersion=21 \
cloud-run-todo

Line 2: web for implementing REST endpoints, data-jpa for database persistence, and postgresql for PostgreSQL driver.

Line 3: --type=maven-project for creating a Maven project.

Line 4: --javaVersion=21 we will use Java 21 in Google Cloud Run environment.

Now that we initialized the project, go to the folder cloud-run-todo and open it with your favourite IDE.

Implementing Entity and Repository

We have only one entity here, Todo, which will be used to store our todo items. Let's create a new entity called Todo as follows.

@Entity
class Todo {
    @Id
    @GeneratedValue(strategy = GenerationType.AUTO)
    private Integer id;
    private String description;
    private Boolean completed;

    public Todo(String description, Boolean completed) {
        this.description = description;
        this.completed = completed;
    }

    public Todo() {

    }

    public Integer getId() {
        return id;
    }

    public void setId(Integer id) {
        this.id = id;
    }

    public String getDescription() {
        return description;
    }

    public void setDescription(String description) {
        this.description = description;
    }

    public Boolean getCompleted() {
        return completed;
    }

    public void setCompleted(Boolean completed) {
        this.completed = completed;
    }
}

In order to manage the Todo entity in the database, we will use the following repository interface.

interface TodoRepository extends CrudRepository<Todo, Integer>{}

TodoRepository will be used to do crud operations for the Todo entity

Implementing Rest Endpoints

Since we have only one entity, we will have one root endpoint /api/v1/todos inside one controller and implement 2 actions for create and listing todo entities as follows

@RestController
@RequestMapping("/api/v1/todos")
class TodoController {

    private final TodoRepository todoRepository;

    TodoController(TodoRepository todoRepository) {
        this.todoRepository = todoRepository;
    }

    @PostMapping
    void create(@RequestBody CreateTodoRequest request) {
        this.todoRepository.save(new Todo(request.getDescription(), false));
    }

    @GetMapping
    Iterable<Todo> list() {
        return this.todoRepository.findAll();
    }
}

create method accepts a request CreateTodoRequest as shown below.

class CreateTodoRequest {
    private String description;

    public CreateTodoRequest(String description) {
        this.description = description;
    }

    public CreateTodoRequest() {
    }

    public String getDescription() {
        return description;
    }

    public void setDescription(String description) {
        this.description = description;
    }
}

Now that the persistence layer and REST endpoints are ready, we can configure the application.

Application Configuration

In a serverless environment, it is best practice to read the port from the PORT environment variable, since it is managed by the serverless provider. We can add the following configuration to application.properties

application.properties
server.port=${PORT:8080}

By doing this, if there is a PORT environment variable, it will take precedence over the default value of 8080. In order to create tables out of the entities automatically, we can use the following config.

application.properties
spring.jpa.hibernate.ddl-auto=update

As a final step, we need to create a file called project.toml in the root of the project to tell Cloud Run to use Java 21

project.toml
[[build.env]]
name = "GOOGLE_RUNTIME_VERSION"
value = "21"

Deploying to Google Cloud Run

We will be using the gcloud CLI to deploy our application to Google Cloud Run. Before running the deployment command, you need to prepare the datasource URL, username, and password for PostgreSQL, which are passed as environment variables to the application. Use the following command to deploy.

gcloud run deploy \
--source . \
--update-env-vars SPRING_DATASOURCE_URL=jdbc:postgresql://<host>:<port>/<db>,SPRING_DATASOURCE_USERNAME=<user>,SPRING_DATASOURCE_PASSWORD=<password>

If you are using Rapidapp as your managed database, do not forget to use the Pooling Port as the port value so that connection pooling is used, letting your database handle highly concurrent requests.

It will prompt for the name of the service; you can press enter to accept the default one. It will also prompt for the region; select the number of the desired region. If there is no problem, it will deploy your application and print the service URL.

Demo

Create Todo

curl -XPOST -H "Content-Type: application/json" https://<your>.a.run.app/api/v1/todos -d '{"description": "buy milk"}'

List Todos

curl -XGET https://<your>.a.run.app/api/v1/todos

Conclusion

Deploying a Spring Boot application to Google Cloud Run is straightforward and efficient, allowing developers to leverage the power of serverless computing. By integrating PostgreSQL with connection pooling using PgBouncer and considering services like RapidApp, you can ensure your application is robust and scalable. With this guide, you're now equipped to deploy your todo app to the cloud, ready to handle real-world workloads with ease.

tip

You can find the complete source code for this project on GitHub.

Automating Image Metadata Extraction with AWS Lambda, Go, and PostgreSQL

· 9 min read
Huseyin BABAL
Software Developer

Introduction

In today's digital age, images play a crucial role in various applications and services. However, managing and extracting metadata from these images can be a challenging task, especially when dealing with large volumes of data. In this article, we'll explore how to leverage AWS Lambda, Go, and PostgreSQL to create an automated system for extracting EXIF data from images and storing it in a database.

What is AWS Lambda?

AWS Lambda is a serverless compute service that lets you run code without provisioning or managing servers. It automatically scales your applications in response to incoming requests, making it an ideal solution for event-driven architectures. With Lambda, you only pay for the compute time you consume, making it cost-effective for various use cases.

Use-cases

AWS Lambda can be employed in numerous scenarios, including:

  • Real-time file processing
  • Data transformations
  • Automated backups
  • Scheduled tasks
  • Webhooks and API backends

In our case, we'll use Lambda to process images as they're uploaded to an S3 bucket, extract their EXIF data, and store it in a PostgreSQL database.

In this article, we will be using PostgreSQL as our database. You can maintain your database in any database management system. For a convenient deployment option, consider cloud-based solutions like Rapidapp, which offers managed PostgreSQL databases, simplifying setup and maintenance.

tip

Create a free database with connection pooling support for the serverless use-cases in Rapidapp in seconds here

Implementation

Project Initialization and Dependencies

In this project we will implement a Go function that depends on AWS Lambda and PostgreSQL. You can initialize the Go project and install the dependencies as follows.

mkdir aws-lambda-go
cd aws-lambda-go
go mod init aws-lambda-go
go get -u github.com/aws/aws-lambda-go/lambda
go get -u github.com/aws/aws-sdk-go-v2/config
go get -u github.com/aws/aws-sdk-go-v2/service/s3
go get -u github.com/lib/pq
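
The handler will also decode EXIF headers; the calls used later (exif.Decode, exifData.Get(exif.Model)) match the API of the github.com/rwcarlsen/goexif package, so assuming that is the library in use, you would add it as well:

go get -u github.com/rwcarlsen/goexif/exif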

Function Endpoint

main.go
package main
...
import "github.com/aws/aws-lambda-go/lambda"
...
func HandleRequest(ctx context.Context, event events.S3Event) (*string, error) {
    // Function logic goes here
}

func main() {
    lambda.Start(HandleRequest)
}

Line 5: As always, context is used to control the execution, and since this function is triggered by an S3 event, we use the events.S3Event type. This means that once this function starts running, we will have a payload that contains the S3 event that triggered the function.

Line 10: In this part, the actual function logic is handed to the lambda.Start wrapper coming from the aws-lambda-go package.

Let's deep dive into actual function logic.

Database Connection

We get the database connection URL from the environment variables and then connect to the database. It is also a good idea to ping the database to be sure it is healthy.

main.go
connStr := os.Getenv("DB_URL")
db, err := sql.Open("postgres", connStr)
if err != nil {
    return nil, fmt.Errorf("failed to open database: %s", err)
}
defer db.Close()

err = db.Ping()
if err != nil {
    return nil, fmt.Errorf("failed to ping database: %s", err)
}
fmt.Println("Successfully connected to the database!")

Retrieving Object from S3

Once the function is triggered by an S3 event, we get the object from the S3 bucket as follows.

main.go
sdkConfig, err := config.LoadDefaultConfig(ctx)
if err != nil {
    return nil, fmt.Errorf("failed to load SDK config: %s", err)
}
s3Client := s3.NewFromConfig(sdkConfig)

var bucket string
var key string
for _, record := range event.Records {
    bucket = record.S3.Bucket.Name
    key = record.S3.Object.URLDecodedKey

    // Get the object
    getObjectOutput, err := s3Client.GetObject(ctx, &s3.GetObjectInput{
        Bucket: &bucket,
        Key:    &key,
    })
    if err != nil {
        return nil, fmt.Errorf("failed to get object %s/%s: %s", bucket, key, err)
    }
    defer getObjectOutput.Body.Close()
    ...
}

Line 1: If you have ever used the AWS SDKs before, you might have seen the credential chain. The AWS SDK can use different methods to resolve credentials when creating a session to connect to AWS services. If you don't pass any credentials, it tries to find them in the environment variables; if it cannot, it falls back to the instance metadata to determine the identity. In the AWS Lambda environment, it knows how to resolve the identity to construct a session in Go.

Line 14: In this part, we get the object from the S3 bucket. We will use this object to decode the image and extract the EXIF information.

Extracting EXIF Data

main.go
buf := new(bytes.Buffer)
_, err = buf.ReadFrom(getObjectOutput.Body)
if err != nil {
return nil, fmt.Errorf("failed to read object %s/%s: %s", bucket, key, err)
}

// Check EXIF data
exifData, err := exif.Decode(buf)
if err != nil {
return nil, fmt.Errorf("failed to decode EXIF data: %s", err)
}

log.Printf("successfully retrieved %s/%s with EXIF DateTime: %v", bucket, key, exifData)

Line 2: Create a reader from S3 object contents to use for decoding EXIF data.

Line 8: Extract EXIF data from image

Store in Postgres Database

There is a lot of information in image headers, but in our case we will use two fields: make and model.

main.go
// SQL statement
sqlStatement := `INSERT INTO images (bucket, key, model, company) VALUES ($1,$2,$3,$4)`

// Execute the insertion
model, err := exifData.Get(exif.Model)
if err != nil {
    return nil, fmt.Errorf("failed to get model: %s", err)
}
company, err := exifData.Get(exif.Make)
if err != nil {
    return nil, fmt.Errorf("failed to get company: %s", err)
}
_, err = db.Exec(sqlStatement, bucket, key, model.String(), company.String())
if err != nil {
    return nil, fmt.Errorf("failed to execute SQL statement: %s", err)
}

We basically read the EXIF data and insert it into the database. You can use the following to create the images table in your database.

CREATE TABLE images (
    bucket varchar(255),
    key varchar(255),
    model varchar(255),
    company varchar(255)
);

  • bucket - S3 bucket name
  • key - S3 object key
  • model - Model name of the camera used to take the image
  • company - Company name of the camera used to take the image

Now that we implemented our image metadata extraction, let's take a look at how we can deploy this function to AWS Lambda.

Deployment

Preparing Artifact

There is a reason to have a main function here: we are about to build an executable that is passed as the bootstrap entrypoint to the AWS Lambda environment. We need to build the executable, zip it, and upload it to AWS Lambda as a new function.

GOOS=linux GOARCH=arm64 go build -tags lambda.norpc -o bootstrap main.go

We build an executable for the Linux OS and ARM64 architecture, using main.go as the entrypoint. The lambda.norpc tag excludes the RPC library from the executable; that library is only needed if you are using the legacy go1.x runtime. Also, we name the executable bootstrap, since this is the entrypoint AWS Lambda expects; the function will not run if you use another name. Finally, we zip the executable and upload it to AWS Lambda as a new function.

zip PhotoHandler.zip bootstrap

AWS Requirements

Once we deploy the function, it will require a set of permissions such as;

  • Accessing S3 buckets
  • Being able to create log groups in CloudWatch
  • Being able to write to CloudWatch logs

We can create an AWS role with the following policy for this purpose and assign it to the Lambda function.
trust-policy.json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "logs:CreateLogGroup",
      "Resource": "arn:aws:logs:<region>:<account-id>:*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "logs:CreateLogStream",
        "logs:PutLogEvents",
        "lambda:InvokeFunction"
      ],
      "Resource": [
        "arn:aws:logs:<region>:<account-id>:log-group:/aws/lambda/PhotoHandler:*",
        "arn:aws:lambda:<region>:<account-id>:function:PhotoHandler"
      ]
    },
    {
      "Effect": "Allow",
      "Action": "s3:GetObject",
      "Resource": "*"
    }
  ]
}

Line 7: This part is used to create the CloudWatch log group. Do not forget to use your own region and account ID. You can grab the account id with the following command

aws sts get-caller-identity

Line 17-18: This section scopes another set of permissions for creating log streams and log events, and for invoking a specific function, PhotoHandler in our case. Again, do not forget to replace the region and account id with your own.

Line 23: This section contains the permission to access S3 buckets.

Now you can store this as trust-policy.json and execute the following command to create the role.

aws iam create-role \
--role-name photo-handler \
--assume-role-policy-document \
file://trust-policy.json

Remember this role name since we will use it on AWS Lambda function creation.

AWS Lambda Function Creation

You can create a new lambda function as follows.

aws lambda create-function \
--function-name PhotoHandler \
--runtime provided.al2023 \
--handler bootstrap \
--architectures arm64 \
--role arn:aws:iam::<account-id>:role/photo-handler \
--zip-file fileb://PhotoHandler.zip

Line 3: provided.al2023 is an OS-only runtime environment; since we already have a binary executable, it can be provided to this environment as the entrypoint.

Line 6: Do not forget to replace the account ID with your own; this part binds the role to this specific function. The function's execution role will then be able to perform the operations granted by the policy we created in the previous section.

Adding S3 Events Trigger

In this section, we will add a trigger for S3 events so that this Lambda function is invoked whenever you upload a new image to a specific S3 bucket.

s3-notification.json
{
"LambdaFunctionConfigurations": [
{
"LambdaFunctionArn": "arn:aws:lambda:<region>:<account-id>:function:PhotoHandler",
"Events": [
"s3:ObjectCreated:*"
],
"Filter": {
"Key": {
"FilterRules": [
{
"Name": "prefix",
"Value": "acme-images/"
},
{
"Name": "suffix",
"Value": ".jpeg"
}
]
}
}
}
]
}

Now you can configure your bucket for the notifications so that it will trigger this lambda function.

aws s3api put-bucket-notification-configuration \
--bucket acme-images \
--notification-configuration file://s3-notification.json

This configuration ensures that notifications about S3 object creation events are sent to trigger the AWS Lambda function. The event can then be consumed inside the HandleRequest function.

Last Step

We have now added a trigger to the Lambda function for S3 events. Whenever you upload a new JPEG file to the acme-images bucket, the Lambda function will be invoked, extract the EXIF data, and finally store it in the PostgreSQL database.

Conclusion

In this article, we explored how to automate image metadata extraction using AWS Lambda, Go, and PostgreSQL. We demonstrated how to use AWS Lambda to handle S3 events, extract EXIF data from images using the exif package in Go, and store the extracted metadata in a PostgreSQL database using Rapidapp, a managed PostgreSQL-as-a-Service. There will be more serverless use cases in future articles, so do not forget to subscribe.

tip

You can find the complete source code for this project on GitHub.

Integrating Spring AI with Vector Databases - A Guide Using PGVector

· 7 min read
Huseyin BABAL
Software Developer

What is a Vector Database?

A vector database is a specialized type of database optimized for storing, retrieving, and performing operations on vector data. Vectors, in this context, are typically arrays of numerical values that represent data in a multi-dimensional space. These are widely used in machine learning and AI for tasks like similarity search, where the goal is to find data points that are close to a given query point in this multi-dimensional space. Vector databases provide efficient indexing and querying capabilities for such operations, often leveraging advanced mathematical and computational techniques to ensure fast and accurate results.

What is PGVector?

pgvector is an extension for PostgreSQL that adds support for storing and querying vector data. It allows users to leverage PostgreSQL's powerful database capabilities while adding specialized functionality for vector operations. With pgvector, you can store high-dimensional vectors, perform similarity searches, and integrate vector operations seamlessly with your existing PostgreSQL databases.

In this article, we will be using PostgreSQL as our database. You can maintain your database in any database management system. For a convenient deployment option, consider cloud-based solutions like Rapidapp, which offers managed PostgreSQL databases, simplifying setup and maintenance.

tip

Create a free database with pgvector support in Rapidapp in seconds here

How Spring Integrates with Vector Databases

Spring, a popular framework for building Java applications, provides robust support for integrating with various types of databases, including vector databases like pgvector. Using Spring AI PGVector Store, developers can easily manage data access and integrate vector operations into their applications. Spring AI offers additional capabilities to enhance machine learning and AI integrations, making it a powerful choice for applications that require advanced data handling and analytics.

Creating a Spring Project

To get started, we'll create a new Spring project. This can be done using Spring Initializr or any other method you prefer. For simplicity, we'll use Spring Initializr here.

  1. Navigate to Spring Initializr: Open your browser and go to Spring Initializr.
  2. Project Settings: Set the following options:
    • Project: Maven Project
    • Language: Java
    • Spring Boot: (select the latest stable version)
    • Dependencies: Add Spring Web
Download the project and unzip it. Open the pom.xml file and add the following dependencies to the dependencies section:
pom.xml
<dependency>
<groupId>org.springframework.ai</groupId>
<artifactId>spring-ai-pgvector-store-spring-boot-starter</artifactId>
</dependency>

<dependency>
<groupId>org.springframework.ai</groupId>
<artifactId>spring-ai-ollama-spring-boot-starter</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.ai</groupId>
<artifactId>spring-ai-transformers-spring-boot-starter</artifactId>
</dependency>

Application YAML Configuration

Next, we need to configure our application to use PostgreSQL as a vector store. Update your application.yaml file as follows:

application.yaml
spring:
datasource:
url: jdbc:postgresql://<host>:<port>/<db>?application_name=rapidapp_spring_ai
username: <user>
password: <password>
ai:
ollama:
embedding:
enabled: false
vectorstore:
pgvector:
index-type: hnsw
distance-type: cosine_distance
dimensions: 384

index-type: Specifies the type of index to be used for vector data. Common options include ivfflat and hnsw.

dimensions: Indicates the dimensionality of the vectors being stored; it must match the output size of your embedding model (384 in our case).

distance-type: Defines the distance metric used for similarity search, such as cosine_distance (used here), Euclidean distance (L2), or inner product.

Index Types

You can see brief descriptions of the index types used in pgvector below; if you want to know more, you can refer here

HNSW

HNSW (Hierarchical Navigable Small World) is an advanced indexing algorithm designed for efficient approximate nearest neighbor search in high-dimensional spaces. It builds a graph structure where each node represents a vector, and edges represent connections to other vectors. The graph is navigable through multiple layers, allowing for fast and scalable searches by traversing the most relevant nodes. HNSW is known for its high accuracy and low search latency, making it suitable for real-time applications requiring quick similarity searches.

IVF Flat

IVF Flat (Inverted File Flat) is a popular indexing method that partitions the vector space into clusters using a coarse quantizer. Each vector is assigned to a cluster, and an inverted list is maintained for each cluster containing the vectors assigned to it. During a search, only the clusters closest to the query vector are examined, significantly reducing the number of comparisons needed. IVF Flat provides a good balance between search speed and accuracy, and it is especially effective when dealing with large datasets, as it limits the scope of the search to relevant clusters.

Distance Types

Distance types are metrics used to measure the similarity or dissimilarity between vectors in a vector database. Different applications and data types may require different distance metrics to ensure accurate and meaningful results. Here are some commonly used distance types

Euclidean Distance (L2)

This is the most widely used distance metric, measuring the straight-line distance between two points in a multi-dimensional space. It's calculated as the square root of the sum of the squared differences between corresponding elements of the vectors. Euclidean distance is suitable for general-purpose similarity searches and is often used in clustering algorithms.

Cosine Similarity

This metric measures the cosine of the angle between two vectors, providing a value between -1 and 1. Cosine similarity is particularly useful when the magnitude of the vectors is not important, focusing instead on the direction. It's commonly used in text mining and natural language processing to measure the similarity of documents or word embeddings.

Inner Product (Dot Product)

This metric calculates the sum of the products of corresponding elements of two vectors. It's often used in neural networks and machine learning models to measure the alignment between vectors. Inner product similarity is useful when comparing vectors where higher values indicate greater similarity.

Manhattan Distance (L1)

Also known as the city block distance, it measures the sum of the absolute differences between corresponding elements of two vectors. Manhattan distance is useful in scenarios where differences in individual dimensions are more significant than the overall geometric distance, such as in certain types of image processing.

Hamming Distance

This metric counts the number of positions at which the corresponding elements of two vectors are different. It's mainly used for binary vectors or strings of equal length, making it suitable for applications in error detection and correction, as well as DNA sequence analysis.

Choosing the right distance type depends on the specific requirements of your application and the nature of your data. Each distance metric has its strengths and weaknesses, and understanding these can help optimize the performance and accuracy of similarity searches in your vector database.
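To make these metrics concrete, here is a small, self-contained Java sketch (purely illustrative and not part of the Spring project) that computes the most common distances for two example vectors:

DistanceExamples.java
public class DistanceExamples {

    public static void main(String[] args) {
        double[] a = {1.0, 2.0, 3.0};
        double[] b = {2.0, 4.0, 6.0};

        System.out.println("Euclidean (L2): " + euclidean(a, b));  // ~3.742
        System.out.println("Cosine sim.:    " + cosine(a, b));     // 1.0 (same direction)
        System.out.println("Inner product:  " + dot(a, b));        // 28.0
        System.out.println("Manhattan (L1): " + manhattan(a, b));  // 6.0
    }

    static double euclidean(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) sum += Math.pow(a[i] - b[i], 2);
        return Math.sqrt(sum);
    }

    static double dot(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) sum += a[i] * b[i];
        return sum;
    }

    static double cosine(double[] a, double[] b) {
        return dot(a, b) / (Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b)));
    }

    static double manhattan(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) sum += Math.abs(a[i] - b[i]);
        return sum;
    }
}

Since b is exactly 2 * a in this example, the cosine similarity is 1.0 even though the Euclidean and Manhattan distances are non-zero, which illustrates why cosine is a good fit when only direction matters.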

Implementing a Document Controller

Create a new controller that will manage vector data operations. It injects the VectorStore bean auto-configured by the pgvector starter and uses it for all vector store interactions.

DocumentController.java
@RestController
@RequestMapping("/documents")
class DocumentController {

@Autowired
private VectorStore vectorStore;

@PostMapping
public void create(@RequestBody CreateDocumentRequest request) {
vectorStore.add(List.of(new Document(request.text(), request.meta())));
}

@GetMapping
public String list(@RequestParam("query") String query) {
List<Document> results = vectorStore.similaritySearch(SearchRequest.query(query).withTopK(5));
return results.toString();
}
}
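The CreateDocumentRequest payload is not part of the snippet above; a minimal sketch as a Java record (the record itself is an assumption, only the text and meta accessors are taken from the controller) could be:

CreateDocumentRequest.java
import java.util.Map;

// Hypothetical request payload; the accessors text() and meta() match
// the calls made in DocumentController above.
public record CreateDocumentRequest(String text, Map<String, Object> meta) {}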

Create Documents

# Document 1
curl \
-H "Content-Type: application/json" \
-d '{"text": "Prometheus collects metrics from targets by scraping metrics HTTP endpoints. Since Prometheus exposes data in the same manner about itself, it can also scrape and monitor its own health.", "meta": {"category": "getting-started"}}' \
http://localhost:8080/documents

# Document 2
curl \
-H "Content-Type: application/json" \
-d '{"text": "Prometheus local time series database stores data in a custom, highly efficient format on local storage.", "meta": {"category": "storage"}}' \
http://localhost:8080/documents

Search Documents

curl http://localhost:8080/documents?query="scrape"

Conclusion

Integrating Spring AI with vector databases like pgvector provides powerful capabilities for handling vector data and performing advanced similarity searches. By leveraging Spring's robust framework and pgvector's specialized vector operations, developers can build sophisticated applications that effectively manage and analyze high-dimensional data. Rapidapp further enhances this setup with its user-friendly interface and built-in vector store support, making it easier than ever to develop and maintain vector-based applications.

tip

You can find the complete source code for this project on GitHub.

Building an Application with JHipster, PostgreSQL, and Elasticsearch in 10 Minutes

· 7 min read
Huseyin BABAL
Software Developer

Introduction

In today’s fast-paced development landscape, creating robust and scalable applications quickly is essential. Leveraging jHipster, PostgreSQL, and Elasticsearch can streamline this process. This article walks you through the steps of building a demo project, showcasing the integration of these powerful tools in just 10 minutes.

Why jHipster?

jHipster accelerates application development by providing a complete stack, including front-end and back-end technologies. It generates high-quality code, follows best practices, and offers extensive tooling, making it a go-to solution for developers seeking efficiency and reliability.

Prerequisites

PostgreSQL

In this article, we will be using PostgreSQL as our database. You can maintain your database in any database management system. For a convenient deployment option, consider cloud-based solutions like Rapidapp, which offers managed PostgreSQL databases, simplifying setup and maintenance.

tip

Create a free database in Rapidapp in seconds here

Elasticsearch

We will be using Elasticsearch for the search engine. Elasticsearch is an open-source, distributed, and scalable search engine which you can deploy on-premises or in the cloud. You can use Elastic Cloud if you don't want to maintain your own instance.

jHipster CLI

To get started with jHipster, you'll need to install the jHipster CLI.

Getting Started

You can simply run the jhipster command in your terminal and follow the prompts to get started, as shown below. Do not forget to provide your own names for fields like the application name, package name, etc.

? What is the base name of your application? demo
? Which *type* of application would you like to create? Monolithic application (recommended for simple projects)
? What is your default Java package name? com.huseyinbabal.demo
? Would you like to use Maven or Gradle for building the backend? Maven
? Do you want to make it reactive with Spring WebFlux? No
? Which *type* of authentication would you like to use? JWT authentication (stateless, with a token)
? Besides JUnit, which testing frameworks would you like to use?
? Which *type* of database would you like to use? SQL (H2, PostgreSQL, MySQL, MariaDB, Oracle, MSSQL)
? Which *production* database would you like to use? PostgreSQL
? Which *development* database would you like to use? PostgreSQL
? Which cache do you want to use? (Spring cache abstraction) Ehcache (local cache, for a single node)
? Do you want to use Hibernate 2nd level cache? Yes
? Which other technologies would you like to use? Elasticsearch as search engine
? Which *framework* would you like to use for the client? React
? Besides Jest/Vitest, which testing frameworks would you like to use?
? Do you want to generate the admin UI? Yes
? Would you like to use a Bootswatch theme (https://bootswatch.com/)? Default JHipster
? Would you like to enable internationalization support? No
? Please choose the native language of the application English

This will generate a full-stack application where PostgreSQL and Elasticsearch are configured and enabled during application startup. Now that you have the basic setup, you can start configuring the datasource.

PostgreSQL Configuration

Once you open the project folder in your favourite IDE, you can see the generated application*.yaml files under the src/main/resources/config folder. Since we are doing local development for now, you can open application-dev.yaml and configure the datasource as follows.

application-dev.yaml
spring:
datasource:
url: jdbc:postgresql://<host>:<port>/<db_name>?sslmode=require&application_name=rapidapp_jhipster # You can find details on Rapidapp db details page.
username: <username>
password: <password>

Line 3: You can find the DB connection details on Rapidapp db details page.

Elasticsearch Configuration

We will configure Elasticsearch as a search engine in our application. Once you create your own Elasticsearch instance, or create one in Elastic Cloud, note your Elasticsearch credentials to use them in the following configuration section.

application-dev.yaml
spring:
elasticsearch:
uris: https://elastic:<password>@<host>:<port>

Running the Application

Now that you have configured your database and search engine, you can start the application with the following command:

./mvnw

The above command will do the following:

  • Build the frontend and backend projects
  • Start the backend project while running Liquibase asynchronously. Liquibase will prepare the database schema using your entities.
  • Start the frontend project.
If everything goes well, you will see output like the following:
2024-06-19T17:01:11.056+03:00  INFO 65684 --- [  restartedMain] com.huseyinbabal.jdemo.JDemoApp          :
----------------------------------------------------------
Application 'jDemo' is running! Access URLs:
Local: http://localhost:8080/
External: http://192.168.1.150:8080/
Profile(s): [dev, api-docs]
----------------------------------------------------------

You can simply navigate to http://localhost:8080/ to access the application. It will show you the default credentials for users with admin and user rights; you can log in with the admin:admin credentials to see what the admin UI looks like. You can see the critical components below:

  • Entities: Entities used in this application. We will see this soon to create our own entities to use in the application.
  • Administration > Metrics: You can see several metrics like JVM, Cache, HTTP statistics.
  • Administration > Health: You can see the health information of the application like db, disk health.
  • Administration > Logs: You can see the log configuration of the application, where you can set the log level at the root or package level.
Feel free to walk through the menus in the Admin UI to get familiar with them; meanwhile, let's see how we can add our own entities to the application.

Adding Entities

We will add our own entities to the application. Let's create a new entity called Product and add it to the application with the following command

jhipster entity product

It will prompt you to add fields for this entity. You can use the following fields:

  • title: String
  • description: String
  • price: Float
Once it is done, jHipster will create the necessary entity in the codebase and a related controller for the CRUD operations. src/main/java/<package>/domain/Product.java contains the generated entity class and src/main/java/<package>/repository/ProductRepository.java contains the generated repository class. For the presentation layer, take a look at src/main/java/<package>/web/rest/ProductResource.java. Let's look at how it creates a product, as shown below.
ProductResource.java
@PostMapping("")
public ResponseEntity<Product> createProduct(@RequestBody Product product) throws URISyntaxException {
log.debug("REST request to save Product : {}", product);
if (product.getId() != null) {
throw new BadRequestAlertException("A new product cannot already have an ID", ENTITY_NAME, "idexists");
}
product = productRepository.save(product);
productSearchRepository.index(product);
return ResponseEntity.created(new URI("/api/products/" + product.getId()))
.headers(HeaderUtil.createEntityCreationAlert(applicationName, false, ENTITY_NAME, product.getId().toString()))
.body(product);
}

Line 8: productSearchRepository.index(product) is used to index the product in Elasticsearch. You can see how easy it is to store product data in Elasticsearch: we haven't written any code for that, but since we added the Elasticsearch configuration, jHipster becomes an Elasticsearch-aware system and generates the required functions for us.

In the same way, let's see how it searches products, as shown below.

ProductResource.java
@GetMapping("/_search")
public ResponseEntity<List<Product>> searchProducts(
@RequestParam("query") String query,
@org.springdoc.core.annotations.ParameterObject Pageable pageable
) {
log.debug("REST request to search for a page of Products for query {}", query);
try {
Page<Product> page = productSearchRepository.search(query, pageable);
HttpHeaders headers = PaginationUtil.generatePaginationHttpHeaders(ServletUriComponentsBuilder.fromCurrentRequest(), page);
return ResponseEntity.ok().headers(headers).body(page.getContent());
} catch (RuntimeException e) {
throw ElasticsearchExceptionMapper.mapException(e);
}
}

Line 8: productSearchRepository.search(query, pageable) is used to search products in Elasticsearch.
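The ProductSearchRepository itself is generated by jHipster, so we never had to write it by hand. Purely as a rough sketch of the idea (this is not the generated code, and the derived query is an assumption), a hand-written Spring Data Elasticsearch repository could look like this:

import org.springframework.data.domain.Page;
import org.springframework.data.domain.Pageable;
import org.springframework.data.elasticsearch.repository.ElasticsearchRepository;

// Simplified, hypothetical repository; the jHipster-generated class is more
// elaborate and builds its query from the raw query string.
public interface ProductSearchRepository extends ElasticsearchRepository<Product, Long> {

    // Derived query matching the search term against title or description.
    Page<Product> findByTitleContainingOrDescriptionContaining(String title, String description, Pageable pageable);
}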

Product Entity in Admin UI

As you can see, we now have a Product entity in the Admin UI, visible under the Entities menu. Once you navigate to the Product module, you can create a new product, list products, view the details of a product, or delete any of them. There is also a search bar where you can search products with the help of Elasticsearch on the backend side.

Conclusion

We have seen that you can leverage jHipster's rapid development capabilities along with the robust data management of PostgreSQL and powerful search functionality of Elasticsearch. jHipster simplifies and accelerates application creation with its comprehensive toolset and best practices, while PostgreSQL ensures reliable and efficient data handling. Elasticsearch adds advanced search capabilities, making your application both scalable and responsive. Utilizing Rapidapp's PostgreSQL as a service further streamlines database management, allowing you to focus on developing high-quality applications quickly and effectively.

tip

You can find the complete source code for this project on GitHub.

Building Devops AI Assistant with Langchain, Ollama, and PostgreSQL

· 6 min read
Huseyin BABAL
Software Developer

Introduction

Vector databases emerge as a powerful tool for storing and searching high-dimensional data like document embeddings, offering lightning-fast similarity queries. This article delves into leveraging PostgreSQL, a popular relational database, as a vector database with the pgvector extension. We'll explore how to integrate it into a LangChain workflow for building a robust question-answering (QA) system.

What are Vector Databases?

Imagine a vast library holding countless documents. Traditional relational databases might classify them by subject or keyword. But what if you want to find documents most similar to a specific concept or question, even if keywords don't perfectly align? Vector databases excel in this scenario. They store data as numerical vectors in a high-dimensional space, where closeness in the space reflects semantic similarity. This enables efficient retrieval of similar documents based on their meaning, not just exact keyword matches.

PostgreSQL as Vector Database

PostgreSQL, a widely adopted and versatile relational database system, can be empowered with vector search capabilities using the pgvector extension. You can maintain your database in any database management system. For a convenient deployment option, consider cloud-based solutions like Rapidapp, which offers managed PostgreSQL databases, simplifying setup and maintenance.

tip

Create a free database in Rapidapp in seconds here

If you maintain PostgreSQL database on your own, you can enable pgvector extension by executing the following command for each database as shown below.

CREATE EXTENSION vector;

LangChain: Building Flexible AI Pipelines

LangChain is a powerful framework that facilitates the construction of modular AI pipelines. It allows you to chain together various AI components seamlessly, enabling the creation of complex and customizable workflows.

Our Use Case: Embedding Data for AI-powered QA

In our specific scenario, we aim to leverage vector search to enhance a question-answering (QA) system. Here's how the components fit together:

  • Data Preprocessing: Process your documents (e.g., web pages) using Natural Language Processing (NLP) techniques to extract relevant text content. Generate vector representations of your documents using an appropriate AI library (e.g., OllamaEmbeddings in your code).

  • Embedding Storage with pgvector: Store the document vectors and their corresponding metadata (e.g., titles, URLs) in your PostgreSQL database table using pgvector.

  • Building the LangChain Workflow: Construct a LangChain pipeline that incorporates the following elements:

    • Retriever: This component retrieves relevant documents from your PostgreSQL database using vector similarity search powered by pgvector. When a user poses a question, the retriever searches for documents with vector representations closest to the query's vector.
    • Question Passage Transformer: (Optional) This component can further process the retrieved documents to extract snippets most relevant to the user's query.
    • Language Model (LLM): This component uses the retrieved context (potentially augmented with question-specific passages) to formulate a comprehensive response to the user's question.

DevOps AI Assistant: Step-by-step Implementation

We will implement the application using Python, and we will use Poetry for dependency management.

Project Creation

Create a directory and initiate a project by running the following command:

poetry init

This will create a pyproject.toml file in the current directory.

Dependencies

You can install dependencies by running the following command:

poetry add langchain-cohere \
langchain-postgres \
langchain-community \
html2text \
tiktoken

Once you have installed the dependencies, you can create an empty main.py file to implement our business logic.

Preparing the PostgreSQL Connection URL

Once you create your database on Rapidapp, or use your own database, you can construct the PostgreSQL connection URL as follows:

postgresql+psycopg://<user>:<pass>@<host>:<port>/<db>

Defining the Vector Store

connection = "<connection_string>"
collection_name = "prometheus_docs"
embeddings = OllamaEmbeddings()

vectorstore = PGVector(
embeddings=embeddings,
collection_name=collection_name,
connection=connection,
use_jsonb=True,
)

As you can see, we use embeddings in the codebase. Your implementation can interact with different AI providers like OpenAI, HuggingFace, Ollama, etc.; the embeddings abstraction provides a standard interface for all of them. In our case, we use OllamaEmbeddings, since we will be using Ollama as the AI provider.

Line 2: This is the collection name in the PostgreSQL database where we store the vector documents. In our case, we will store a couple of Prometheus documentation pages to help the AI provider answer users' questions.

Line 5: LangChain has lots of vector store implementations, and PGVector is one of them. This lets us perform vector similarity search against the PostgreSQL database.

Indexing Documents

urls = ["https://prometheus.io/docs/prometheus/latest/getting_started/", "https://prometheus.io/docs/prometheus/latest/federation/"]
loader = AsyncHtmlLoader(urls)
docs = loader.load()

htmlToText = Html2TextTransformer()
docs_transformed = htmlToText.transform_documents(docs)

splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
chunk_size=1000, chunk_overlap=0
)
docs = splitter.split_documents(docs_transformed)
vectorstore.add_documents(docs)

Line 1-3: With the help of AsyncHtmlLoader, we simply load two Prometheus documentation pages.

Line 5-6: Since we cannot use the raw HTML directly, we convert it to text using Html2TextTransformer.

Line 8-11: RecursiveCharacterTextSplitter helps by chunking large text documents into manageable pieces that comply with vector store limitations, improve embedding efficiency, and potentially enhance retrieval accuracy.

Line 12: Store processed documents into vector store.

Building the LangChain Workflow

retriever = vectorstore.as_retriever()
llm = Ollama()

message = """
Answer this question using the provided context only.

{question}

Context:
{context}
"""

prompt = ChatPromptTemplate.from_messages([("human", message)])

rag_chain = {"context": retriever, "question": RunnablePassthrough()} | prompt | llm
response = rag_chain.invoke("how to federate on prometheus")
print(response)

Above code snippet demonstrates how to use LangChain to retrieve information from a vector store and generate a response using a large language model (LLM) based on the retrieved information. Let's break it down step-by-step:

Line 1: This line converts the vector store we set up earlier into a retriever within LangChain. The retriever is responsible for fetching relevant documents based on a query.

Line 2: This line initializes an instance of the Ollama LLM, which will be used to generate the response to the question.

Line 4: The code defines a multi-line string variable named message. The string is a template with two placeholders: {question}, which will hold the specific question you want to answer, and {context}, which will contain the relevant background information retrieved for that question.

Line 13: Builds the chat prompt template from the message.

Line 15: Here the retrieved context and the question are piped into the template to generate the prompt, which is then passed to the LLM to generate the response. Note that this composition forms a runnable chain.

Line 16: We invoke the chain with a question and get the response.

Conclusion

In this practical guide, we've delved into using PostgreSQL as a vector database, leveraging the pgvector extension. We explored how this approach can be used to build a context-aware AI assistant, focusing on Prometheus documentation as an example. By storing document embeddings alongside their metadata, we enabled the assistant to retrieve relevant information based on semantic similarity, going beyond simple keyword matching. LangChain played a crucial role in this process. Its modular framework allowed us to effortlessly connect various AI components, like PGVector for vector retrieval and OllamaEmbeddings for interacting with our chosen AI provider. Furthermore, LangChain's ability to incorporate context within user questions significantly enhances the relevance and accuracy of the assistant's responses.

tip

You can find the complete source code for this project on GitHub.

Securing Your Spring Boot App with JWT Authentication

· 8 min read
Huseyin BABAL
Software Developer

Introduction

This article dives into securing a Spring Boot application using JSON Web Tokens (JWT) for authentication. We'll explore Spring Security, JWT fundamentals, and then implement a secure API with user registration, login, and access control. Our data will be persisted in a PostgreSQL database using Spring Data JPA.

Why Spring Security?

Spring Security is an industry-standard framework for securing Spring applications. It offers comprehensive features for authentication, authorization, and access control. By leveraging Spring Security, we can efficiently manage user access to our API endpoints.

JWT Authentication Explained

JWT is a token-based authentication mechanism. Unlike traditional session-based methods, JWT stores user information in a compact, self-contained token. This token is sent with every request, allowing the server to verify the user's identity without relying on server-side sessions.

Here's a breakdown of JWT's benefits:

  • Stateless: Removes the need for session management on the server.
  • Secure: Employs digital signatures to prevent tampering.
  • Flexible: Can be configured with various claims to store user information.

Persistence Layer

In this article, we will be using PostgreSQL as our database. You can maintain your database in any database management system. For a convenient deployment option, consider cloud-based solutions like Rapidapp, which offers managed PostgreSQL databases, simplifying setup and maintenance.

tip

Create a free database in Rapidapp in seconds here

Step-by-Step Implementation

Dependencies

Make sure you have the following dependencies installed using your favourite dependency management tool, e.g. Maven or Gradle.

pom.xml
 <dependencies>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-security</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-web</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-data-jpa</artifactId>
</dependency>
<dependency>
<groupId>org.postgresql</groupId>
<artifactId>postgresql</artifactId>
<version>42.7.3</version>
</dependency>
<dependency>
<groupId>org.projectlombok</groupId>
<artifactId>lombok</artifactId>
<version>1.18.32</version>
</dependency>
<dependency>
<groupId>io.jsonwebtoken</groupId>
<artifactId>jjwt-api</artifactId>
<version>0.12.5</version>
</dependency>
<dependency>
<groupId>io.jsonwebtoken</groupId>
<artifactId>jjwt-impl</artifactId>
<version>0.12.5</version>
</dependency>
<dependency>
<groupId>io.jsonwebtoken</groupId>
<artifactId>jjwt-jackson</artifactId>
<version>0.12.5</version>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-test</artifactId>
<scope>test</scope>
</dependency>
</dependencies>

Enabling Spring Web Security

In order to enable Spring Web Security, you need to configure it in your SecurityConfig.java file as shown below.

SecurityConfig.java
@Configuration
@EnableWebSecurity
@RequiredArgsConstructor
public class SecurityConfig {

private static final String[] AUTH_WHITELIST = {
"/api/v1/auth/login",
"/api/v1/auth/register"
};

private final JwtAuthFilter jwtAuthFilter;

@Bean
public SecurityFilterChain securityFilterChain(HttpSecurity http) throws Exception {
http
.csrf(AbstractHttpConfigurer::disable)
.authorizeRequests(authorizeRequests ->
authorizeRequests
.requestMatchers(AUTH_WHITELIST).permitAll()
.anyRequest().authenticated()
)
.sessionManagement(sessionManagement ->
sessionManagement
.sessionCreationPolicy(SessionCreationPolicy.STATELESS))
.addFilterBefore(jwtAuthFilter, UsernamePasswordAuthenticationFilter.class);
return http.build();
}
}

Line 2: Add @EnableWebSecurity to the SecurityConfig class to protect the API endpoints.

Line 6: Allow requests to the /api/v1/auth/login and /api/v1/auth/register endpoints without authentication.

Line 16: Disable CSRF protection, since JWT authentication is stateless.

Line 24: Set the session creation policy to STATELESS to ensure sessions are not maintained.

Line 25: Add the JwtAuthFilter to the security filter chain before the UsernamePasswordAuthenticationFilter. We will explain JwtAuthFilter class soon.

JWT Auth Filter

In order to enable JWT authentication, you need to configure it in your JwtAuthFilter.java file as shown below.

JwtAuthFilter.java
@Component
@RequiredArgsConstructor
public class JwtAuthFilter extends OncePerRequestFilter {

private final JwtService jwtService;
private final UserDetailsService userDetailsService;

@Override
protected void doFilterInternal(HttpServletRequest request, HttpServletResponse response, FilterChain filterChain) throws ServletException, IOException {
if (request.getServletPath().contains("/api/v1/auth")) {
filterChain.doFilter(request, response);
return;
}

final String authorizationHeader = request.getHeader("Authorization");
final String jwtToken;
final String email;

if (authorizationHeader == null || !authorizationHeader.startsWith("Bearer ")) {
filterChain.doFilter(request, response);
return;
}

jwtToken = authorizationHeader.substring(7);
email = jwtService.extractEmail(jwtToken);

if (email != null && SecurityContextHolder.getContext().getAuthentication() == null) {
UserDetails userDetails = userDetailsService.loadUserByUsername(email);
if (jwtService.validateToken(jwtToken, userDetails)) {
UsernamePasswordAuthenticationToken authenticationToken = new UsernamePasswordAuthenticationToken(userDetails, null, userDetails.getAuthorities());
authenticationToken.setDetails(new WebAuthenticationDetailsSource().buildDetails(request));
SecurityContextHolder.getContext().setAuthentication(authenticationToken);
}
}
filterChain.doFilter(request, response);
}
}

Line 10: Do not apply JWT auth filter for /api/v1/auth endpoints.

Line 24: Extract JWT token from the Authorization header. Its format is Bearer <token>, that's why it is substring(7).

Line 25: Extract email from the JWT token using JwtService which we will take a look at in the next section.

Line 28-32: Load the user details from UserDetailsService, validate the JWT token using JwtService, and store the resulting authentication in the SecurityContextHolder.

Implementing JWTService

This class contains all JWT related functionalities as shown below.

JwtService.java
@Service
public class JwtService {

@Value("${jwt.secret}")
private String secret;

public String extractEmail(String jwtToken) {
return extractClaim(jwtToken, Claims::getSubject);
}

public <T> T extractClaim(String jwtToken, Function<Claims, T> claimsResolver) {
final Claims claims = extractAllClaims(jwtToken);
return claimsResolver.apply(claims);
}

private Claims extractAllClaims(String jwtToken) {
return Jwts.parser().verifyWith(getSigningKey()).build().parseSignedClaims(jwtToken).getPayload();
}

private SecretKey getSigningKey() {
byte [] bytes = Decoders.BASE64.decode(secret);
return Keys.hmacShaKeyFor(bytes);
}

public boolean validateToken(String jwtToken, UserDetails userDetails) {
final String email = extractEmail(jwtToken);
return email.equals(userDetails.getUsername()) && !isTokenExpired(jwtToken);
}

private boolean isTokenExpired(String jwtToken) {
return extractExpiration(jwtToken).before(new Date());
}

private Date extractExpiration(String jwtToken) {
return extractClaim(jwtToken, Claims::getExpiration);
}

public String generateToken(User u) {
return createToken(u.getEmail());
}

private String createToken(String email) {
return Jwts.builder()
.subject(email)
.issuedAt(new Date(System.currentTimeMillis()))
.expiration(new Date(System.currentTimeMillis() + 1000 * 60 * 60 * 10))
.signWith(getSigningKey())
.compact();
}
}

Line 5: This is the secret key used to sign JWT tokens. It should be carefully protected and never shared or exposed publicly. The remaining functions are self-explanatory.
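Since the secret is decoded with Decoders.BASE64, the jwt.secret property must hold a Base64-encoded value of sufficient length. As a rough sketch (the class name and where you store the output are assumptions), you could generate one with plain JDK classes:

SecretGenerator.java
import java.security.SecureRandom;
import java.util.Base64;

// One-off helper to produce a Base64-encoded 256-bit secret for the
// jwt.secret property. Keep the output in an environment variable or a
// secret manager, never in the codebase.
public class SecretGenerator {
    public static void main(String[] args) {
        byte[] key = new byte[32]; // 256 bits is enough for HMAC-SHA256
        new SecureRandom().nextBytes(key);
        System.out.println(Base64.getEncoder().encodeToString(key));
    }
}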

UserDetailsService

UserDetailsService tells Spring Security how to load user details from the database, as shown below.

UserDetailsService.java
@Service
@RequiredArgsConstructor
public class UserDetailService implements UserDetailsService {
private final UserRepository userRepository;


@Override
public UserDetails loadUserByUsername(String email) throws UsernameNotFoundException {
return userRepository.findByEmail(email)
.map(user -> User.builder().username(user.getEmail())
.password(user.getPassword())
.build())
.orElseThrow(() -> new UsernameNotFoundException("User not found"));
}
}
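The User entity and UserRepository are not shown in the snippets above. A minimal sketch of what they could look like follows; the table name, ID strategy, and annotations are assumptions, and only the fields and the findByEmail method are taken from the code we have already seen.

User.java
import jakarta.persistence.Entity;
import jakarta.persistence.GeneratedValue;
import jakarta.persistence.GenerationType;
import jakarta.persistence.Id;
import jakarta.persistence.Table;
import lombok.AllArgsConstructor;
import lombok.Builder;
import lombok.Getter;
import lombok.NoArgsConstructor;

// Hypothetical entity sketch; "users" avoids clashing with the reserved
// "user" keyword in PostgreSQL.
@Entity
@Table(name = "users")
@Getter
@Builder
@NoArgsConstructor
@AllArgsConstructor
public class User {
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;
    private String email;
    private String password;
    private String firstName;
    private String lastName;
}

UserRepository.java
import java.util.Optional;
import org.springframework.data.jpa.repository.JpaRepository;

// findByEmail is the only query the other snippets rely on.
public interface UserRepository extends JpaRepository<User, Long> {
    Optional<User> findByEmail(String email);
}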

Up to this point, we have only focused on validating JWT tokens. But how do we generate a token in the first place, and what is its use case? Let's look at that next.

Registering User

Before generating a JWT token to authenticate a user, we first need to register that user. We will use AuthController to register and log in users.

AuthController.java
@RestController
@RequestMapping(path = "api/v1/auth")
@RequiredArgsConstructor
public class AuthController {

private final AuthService authService;


@PostMapping(path = "/register")
@ResponseStatus(HttpStatus.NO_CONTENT)
public void register(@RequestBody RegisterRequest registerRequest) {
authService.register(registerRequest);
}

@PostMapping(path = "/login")
public ResponseEntity<String> login(@RequestBody LoginRequest loginRequest) {
return ResponseEntity.ok(authService.login(loginRequest));
}
}
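The RegisterRequest and LoginRequest payloads are not shown here. A minimal sketch, assuming Lombok-generated getters (the field names are derived from the getters used in AuthService below), could be:

RegisterRequest.java
import lombok.Data;

@Data
public class RegisterRequest {
    private String email;
    private String password;
    private String firstName;
    private String lastName;
}

LoginRequest.java
import lombok.Data;

@Data
public class LoginRequest {
    private String email;
    private String password;
}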

In the above controller, we use AuthService to register and log in users. AuthService uses UserRepository to interact with the database for user-related operations.

AuthService.java
@Service
@RequiredArgsConstructor
public class AuthService {

private final UserRepository userRepository;
private final AuthenticationManager authenticationManager;
private final JwtService jwtService;
private final BCryptPasswordEncoder bCryptPasswordEncoder;

public void register(RegisterRequest registerRequest) {
User u = User.builder()
.email(registerRequest.getEmail())
.password(bCryptPasswordEncoder.encode(registerRequest.getPassword()))
.firstName(registerRequest.getFirstName())
.lastName(registerRequest.getLastName())
.build();
userRepository.save(u);
}

public String login(LoginRequest loginRequest) {
authenticationManager.authenticate(new UsernamePasswordAuthenticationToken(loginRequest.getEmail(), loginRequest.getPassword()));
User u = userRepository.findByEmail(loginRequest.getEmail()).orElseThrow(() -> new EntityNotFoundException("User not found"));
return jwtService.generateToken(u);

}
}

Line 10: Register user by using the details provided in the request payload. The bCryptPasswordEncoder is used to hash the password before storing it in the database.

Line 21: The login operation is done through authenticationManager since it knows how to validate username and password.
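Note that AuthService injects an AuthenticationManager and a BCryptPasswordEncoder, and neither is exposed as a bean automatically. A minimal configuration sketch (the class name and wiring are assumptions, built from standard Spring Security building blocks) might look like this:

ApplicationConfig.java
import lombok.RequiredArgsConstructor;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.security.authentication.AuthenticationManager;
import org.springframework.security.authentication.AuthenticationProvider;
import org.springframework.security.authentication.dao.DaoAuthenticationProvider;
import org.springframework.security.config.annotation.authentication.configuration.AuthenticationConfiguration;
import org.springframework.security.core.userdetails.UserDetailsService;
import org.springframework.security.crypto.bcrypt.BCryptPasswordEncoder;

@Configuration
@RequiredArgsConstructor
public class ApplicationConfig {

    private final UserDetailsService userDetailsService;

    @Bean
    public BCryptPasswordEncoder bCryptPasswordEncoder() {
        return new BCryptPasswordEncoder();
    }

    // Authenticate against our UserDetailService using BCrypt-hashed passwords.
    @Bean
    public AuthenticationProvider authenticationProvider() {
        DaoAuthenticationProvider provider = new DaoAuthenticationProvider();
        provider.setUserDetailsService(userDetailsService);
        provider.setPasswordEncoder(bCryptPasswordEncoder());
        return provider;
    }

    @Bean
    public AuthenticationManager authenticationManager(AuthenticationConfiguration config) throws Exception {
        return config.getAuthenticationManager();
    }
}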

Restricted Access to UserController

Below you can see a sample endpoint implementation for the user object.

UserController.java
@RestController
@RequestMapping(path = "api/v1")
public class UserController {
private final UserRepository userRepository;
public UserController(UserRepository userRepository) {
this.userRepository = userRepository;
}

@GetMapping("/users")
public List<User> getUsers() {
return userRepository.findAll();
}
}

Assume you registered a new user with the email admin and the password ssshhhh. Then, in order to generate a JWT token, you can use the following curl request.

curl -X POST -H "Content-Type: application/json" \
-d '{"email": "admin", "password": "ssshhhh"}' http://localhost:8080/api/v1/auth/login

It will return a JWT token, which you can use to authenticate the user. Store it somewhere.

Now in order to access restricted user endpoint, you can use the following curl request.

curl -X GET -H "Authorization: Bearer <token>" http://localhost:8080/api/v1/users

Conclusion

This hands-on tutorial equipped you with the knowledge to implement JWT Authentication in your Spring Boot application. We explored user registration, login, and access control, leveraging Spring Security and JPA for data persistence. By following these steps and customizing the code examples to your specific needs, you can secure your API endpoints and ensure authorized user access. Remember to prioritize security best practices. Here are some additional points to consider:

  • Secret Key Management: Store your JWT secret key securely in environment variables or a dedicated secret management service. Never expose it in your codebase.
  • Token Expiration: Set a reasonable expiration time for JWT tokens to prevent unauthorized access due to compromised tokens.
  • Error Handling: Implement proper error handling mechanisms for invalid or expired tokens to provide informative feedback to users.
  • Advanced Features: Explore advanced JWT features like refresh tokens for longer-lived sessions and role-based access control (RBAC) for granular authorization.
With JWT authentication in place, your Spring Boot application is well on its way to becoming a secure and robust platform. Deploy it with confidence, knowing that user access is properly controlled.
tip

You can find the complete source code for this project on GitHub.

Building a Realtime Chat App with React, Node.js, and PostgreSQL

· 6 min read
Huseyin BABAL
Software Developer

Introduction

In today's interactive world, real-time communication thrives. This article guides you through building a basic real-time chat application using React for the frontend, Node.js for the backend, and PostgreSQL for data persistence. We'll leverage WebSockets to establish a persistent connection between clients (web browsers) and the server, enabling instant message updates.

Understanding WebSockets

Imagine a two-way highway where messages flow seamlessly between clients and servers. That's the essence of WebSockets. Unlike traditional HTTP requests, which are one-off interactions, WebSockets facilitate a long-lasting connection, allowing for real-time data exchange.

WebSockets in Action

WebSockets empower a variety of applications:

  • Chat Applications: Deliver messages instantly, fostering a more engaging conversation experience.
  • Collaborative Editing: Enable multiple users to work on a document simultaneously, seeing changes in real-time.
  • Stock Tickers: Continuously update stock prices on a financial dashboard.
  • Multiplayer Games: Facilitate smooth real-time interactions among players.
  • Social Media Updates: Receive notifications and updates without page reloads.

PostgreSQL: A Robust Database

PostgreSQL, a powerful open-source object-relational database management system, provides an excellent platform for storing chat messages and potentially user information. Its key features include:

  • ACID Transactions: Ensure data integrity through atomicity, consistency, isolation, and durability.
  • Scalability: Handle large chat datasets efficiently.
  • Flexibility: Model complex data structures for additional features.
For a convenient deployment option, consider cloud-based solutions like Rapidapp, which offers managed PostgreSQL databases, simplifying setup and maintenance.
tip

Create a free database in Rapidapp in seconds here

Backend: Node.js WebSocket Server

Here's a breakdown of the Node.js code utilizing websocket to establish a WebSocket connection and handle message broadcasting:

server.js
const WebSocket = require('websocket').server;
const http = require('http');
const cors = require('cors');
const Sequelize = require('sequelize');

const sequelize = new Sequelize(process.env.DATABASE_URL)

const Message = sequelize.define('Message', {
username: Sequelize.DataTypes.STRING,
text: Sequelize.DataTypes.STRING,
timestamp: Sequelize.DataTypes.DATE
});

sequelize.authenticate().then(() =>{
console.log("Connection has been established successfully.")
Message.sync();
createServer();
}).catch((err) => {
console.error("Unable to connect to the database:", err)
})

function createServer() {

const httpServer = http.Server((req, res) => {
cors()(req, res, () => {
if (req.url === '/messages') {
fetchMessages().then((messages) => {
res.writeHead(200, {
'Content-Type': 'application/json'
})
res.end(JSON.stringify(messages))
})
}
})
})

const webSocketServer = new WebSocket({
httpServer: httpServer
})

webSocketServer.on('request', (req) => {
const connection = req.accept(null, req.origin);

connection.on('message', (message) => {
let msg = JSON.parse(message.utf8Data)
Message.create({
username: msg.username,
text: msg.text,
timestamp: msg.timestamp

})
webSocketServer.broadcast(JSON.stringify(message))
})

connection.on('close', () => {
// defer conn
})
})

httpServer.listen(3005, () => console.log('Listening on port 3005'));
}

function fetchMessages() {
return Message.findAll();
}

Line 1-4: Import necessary modules and libraries. We use the websocket library to create a WebSocket server. The http module is used to create an HTTP server, and cors is used to enable Cross-Origin Resource Sharing to be able to access http backend from a different origin.

Line 6: Sequelize is a Node.js ORM library to access various databases. Create a Sequelize instance with the connection string from Rapidapp or your own managed PostgreSQL service. The url format should be like postgresql://<user>:<password>@<host>:<port>/<dbname>?ssl=true&sslmode=no-verify&application_name=<client_name>

Line 8-12: Define a Message model with username, text, and timestamp fields.

Line 14-20: Authenticate the connection to the database and create the Message table if it doesn't exist. If the connection is successful, start the server.

Line 24-35: Create an HTTP server to handle incoming requests. If the request URL is /messages, fetch all messages from the database and return them as JSON.

Line 37-39: Create a WebSocket server using the websocket library and attach it to the HTTP server.

Line 44-53: Handle incoming WebSocket requests. When a message is received, parse it, save it to the database, and broadcast it to all connected clients.

Line 60: Start the HTTP server on port 3005.

Frontend: React with WebSocket Connection

Here's the React component managing the chat interface, message sending, and receiving messages through the WebSocket connection:

App.js
import './App.css';
import {useEffect, useState, useRef} from "react";

function App() {

const [messages, setMessages] = useState([]);
const [messageInput, setMessageInput] = useState('');

const socket = useRef(null);

const [username, setUsername] = useState('');

const [existingUserName, setExistingUsername] = useState(localStorage.getItem('username') || '');


useEffect(() => {
socket.current = new WebSocket('ws://localhost:3005');

socket.current.onopen = () => {
console.log('Connected successfully.');
}

socket.current.onmessage = (event) => {
const msg = JSON.parse(event.data);
setMessages([...messages, JSON.parse(msg['utf8Data'])])
}

return () => {
socket.current.close();
};
}, [messages]);

useEffect(() => {
fetchMessages();
}, []);


const sendMessage = () => {
if (messageInput.trim() === '') {
return
}

const message = {
text: messageInput,
username: existingUserName,
timestamp: new Date().toISOString(),
}

socket.current.send(JSON.stringify(message))
setMessageInput('')
}

const fetchMessages = () => {
fetch('http://localhost:3005/messages')
.then((res) => res.json())
.then((data) => {
setMessages(data)
})
}

return (
<div className="App">
{existingUserName ? (
<div className="chat-container">
<div className="chat-messages">
{messages.map((message, index) => (
<div className="message sent">
<span className="message-timestamp">{message['username']}</span>
<span className="message-content">{message['text']}</span>
</div>
))}
</div>
<div className="chat-input">
<input
type="text"
placeholder="Type your message"
value={messageInput}
onChange={(e) => setMessageInput(e.target.value)}
onKeyDown={(e) => {
if (e.key === 'Enter' && messageInput.trim() !== '') {
sendMessage();
}
}}
/>
<button onClick={sendMessage}>Send</button>
</div>
</div>) : (
<div className="chat-input">
<input
type="text"
placeholder="Enter your username"
value={username}
onChange={(e) => setUsername(e.target.value)}
/>
<button onClick={() => {
const u=username.trim()
localStorage.setItem('username', u);
setExistingUsername(u)
}}>Connect</button>
</div>
)}
</div>
);
}

export default App;

Line 17: Connect to the WebSocket server running on ws://localhost:3005.

Line 23: Handle incoming messages from the WebSocket server. Parse the message and update the messages list.

Line 34: Fetch messages from the server when the component mounts.

Line 38: A function that sends a message to the WebSocket server.

Line 63: Render the chat interface. If the user has not set a username, prompt them to enter one.

Conclusion

This article has provided a foundational understanding of building a real-time chat application using React, Node.js, WebSockets, and PostgreSQL. The code snippets demonstrate the core functionalities of establishing a WebSocket connection, sending messages, broadcasting them to connected clients, and persisting messages in a database. Remember:

  • This is a simplified example, and real-world applications might require additional features like user authentication, authorization, message editing/deletion, and handling disconnects gracefully.
  • Consider implementing security measures to protect your application from malicious attacks.
  • For large-scale applications, further optimize the database structure and explore scaling strategies.
By building upon this foundation and tailoring it to your specific needs, you can create a dynamic and engaging real-time chat application.
tip

You can find the complete source code for this project on GitHub.