Building a Serverless Second-hand Marketplace Chatbot using AWS WebSocket API with GPT-3.5

Sigrid Jin
9 min read · Jul 2, 2023


Reference: https://docs.aws.amazon.com/apigateway/latest/developerguide/websocket-api-chat-app.html

Serverless architectures have been a game-changer, allowing developers to focus on their application logic without managing servers or infrastructure. I set out to create a serverless chatroom that uses the OpenAI GPT-3.5 model to simulate interactions between buyers and sellers, providing a realistic environment for testing and development. The application is built on AWS services such as DynamoDB and the API Gateway WebSocket API, and it uses AWS Lambda for serverless compute.

The power of Serverless

Serverless is a cloud computing model where the cloud provider manages server provisioning and operations. Serverless architectures typically scale automatically, are highly available, and require you to pay only for the resources your application consumes.

The Serverless Framework is an open-source tool for building and deploying serverless applications on multiple providers, including AWS. Here, I used the framework to deploy AWS Lambda functions, allowing the app to run in response to events arriving through Amazon API Gateway.

WebSocket API

WebSocket is a communication protocol providing full-duplex communication channels over a single TCP connection. The AWS API Gateway supports WebSocket APIs, providing real-time, two-way communication between the client and the server.
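API Gateway decides which Lambda function to invoke by applying a route selection expression (by default `$request.body.action`) to each incoming frame. A minimal sketch of that dispatch logic, using a hypothetical `selectRoute` helper that is not part of the article's code:

```typescript
// Sketch: how API Gateway's default route selection expression
// ($request.body.action) maps an incoming frame to a route.
type RouteKey = '$connect' | '$disconnect' | '$default' | string;

function selectRoute(rawBody: string, knownRoutes: string[]): RouteKey {
  try {
    const action = JSON.parse(rawBody)?.action;
    // If the body's "action" matches a configured route, use it;
    // otherwise API Gateway falls back to $default.
    return typeof action === 'string' && knownRoutes.includes(action)
      ? action
      : '$default';
  } catch {
    return '$default'; // non-JSON frames also land on $default
  }
}

console.log(selectRoute('{"action":"message","prompt":"hi"}', ['message'])); // message
console.log(selectRoute('hello', ['message'])); // $default
```

A frame like `{"action":"message", ...}` therefore lands on the `message` route, while anything else falls through to `$default`.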

// https://github.com/Web3-Study-with-Sigrid-Jin/serverless-chatgpt-ws/blob/dev/src/functions/hello/handler.ts
import { ChatCompletionRequestMessageRoleEnum, Configuration, OpenAIApi } from 'openai';
import { DynamoDB } from 'aws-sdk';
import { newApiGatewayManagementApi } from '@yingyeothon/aws-apigateway-management-api';
import { APIGatewayProxyEvent } from 'aws-lambda';

interface MessageBody {
  prompt: string;
  conversationId: string;
}

const configuration = new Configuration({
  apiKey: 'your-api-key-here',
});

const openai = new OpenAIApi(configuration);

// In-memory conversation store, keyed by conversationId.
const conversations: { [key: string]: Array<{ role: ChatCompletionRequestMessageRoleEnum; content: string }> } = {};

export const connect = async (event: APIGatewayProxyEvent) => {
  await new DynamoDB()
    .putItem({
      TableName: 'ConnectionIds',
      Item: {
        connectionId: { S: event.requestContext.connectionId! },
      },
    })
    .promise();
  return {
    statusCode: 200,
    body: JSON.stringify({ msg: 'OK' }),
  };
};

export const disconnect = async (event: APIGatewayProxyEvent) => {
  delete conversations[event.requestContext.connectionId!];
  await new DynamoDB()
    .deleteItem({
      TableName: 'ConnectionIds',
      Key: {
        connectionId: { S: event.requestContext.connectionId! },
      },
    })
    .promise();
  return {
    statusCode: 200,
    body: JSON.stringify({ msg: 'OK' }),
  };
};

export const message = async (event: APIGatewayProxyEvent) => {
  const { prompt, conversationId }: MessageBody = JSON.parse(event.body!);

  console.log('prompt', prompt);

  if (!conversations[conversationId]) {
    conversations[conversationId] = [
      // System prompt (Korean): "Act like a seller on Joonggonara (a
      // second-hand marketplace). You will now talk with a buyer. Assume
      // you are selling a used item and try to sell it at the highest
      // possible price. Think, act, and answer like a Korean woman in her
      // fifties from the neighborhood. Use casual speech, not honorifics."
      { role: 'system', content: '중고나라에서 물품 판매하는 사람처럼 행동하라. 너는 이제 구매자와 대화하게 될 것이다. 구매자와 중고 물건을 판매한다고 생각하고 최대한 높은 가격에 팔기 위해 노력하라. 너는 동네 50대 한국인 아줌마처럼 생각하고 행동하고 답해야 한다. 경어체를 쓰지 말고 반말체를 사용하라.' },
    ];
  }

  conversations[conversationId].push({ role: 'user', content: prompt });

  const completion = await openai.createChatCompletion({
    model: 'gpt-3.5-turbo-16k',
    messages: conversations[conversationId],
    temperature: 0.6,
    max_tokens: 100,
  });

  const openaiResponse = completion.data.choices[0].message!.content;

  conversations[conversationId].push({ role: 'assistant', content: openaiResponse });

  console.log('openaiResponse', openaiResponse);

  const dbResult = await new DynamoDB()
    .scan({
      TableName: 'ConnectionIds',
      ProjectionExpression: 'connectionId',
    })
    .promise();

  const api = newApiGatewayManagementApi({
    endpoint: event.requestContext.domainName! + '/' + event.requestContext.stage!,
  });

  console.log('dbResult', dbResult);

  // Broadcast the assistant's reply to every stored connection.
  await Promise.all(
    (dbResult.Items ?? []).map(({ connectionId }) =>
      api
        .postToConnection({
          ConnectionId: connectionId.S!,
          Data: JSON.stringify({
            message: openaiResponse,
            conversationId,
          }),
        })
        .promise(),
    ),
  );

  return {
    statusCode: 200,
    body: JSON.stringify({ msg: 'OK' }),
  };
};

Let’s dissect the code above to understand its operation. The function uses several AWS services, including DynamoDB for storing connection identifiers and API Gateway for managing WebSocket connections.

We start by importing all the necessary libraries and defining the initial setup.

import { ChatCompletionRequestMessageRoleEnum, Configuration, OpenAIApi } from 'openai';
import { DynamoDB } from 'aws-sdk';
import { newApiGatewayManagementApi } from '@yingyeothon/aws-apigateway-management-api';
import { APIGatewayProxyEvent } from 'aws-lambda';

The first part of the code configures the OpenAI client with an API key (hard-coded here for brevity; in practice, load it from an environment variable or a secrets store).

const configuration = new Configuration({
  apiKey: 'your-api-key-here',
});

const openai = new OpenAIApi(configuration);

Next, we define a dictionary to store the ongoing conversations. Each conversation is stored against its conversationId and consists of a list of messages. Each message has a role ("system", "user", or "assistant") and content:

const conversations: { [key: string]: Array<{ role: ChatCompletionRequestMessageRoleEnum; content: string }> } = {};
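To illustrate how this store evolves, here is a self-contained sketch (independent of the handler code, with a placeholder system prompt): the first message for a conversation seeds the system prompt, and each subsequent turn appends to the history.

```typescript
// Self-contained sketch of the conversation store's lifecycle.
// `Role` stands in for ChatCompletionRequestMessageRoleEnum.
type Role = 'system' | 'user' | 'assistant';
type Message = { role: Role; content: string };

const store: { [key: string]: Message[] } = {};

function appendUserMessage(conversationId: string, prompt: string): Message[] {
  if (!store[conversationId]) {
    // First message: seed the conversation with the system prompt.
    store[conversationId] = [{ role: 'system', content: 'Act as a second-hand seller.' }];
  }
  store[conversationId].push({ role: 'user', content: prompt });
  return store[conversationId];
}

const history = appendUserMessage('conv-1', 'Is the bike still available?');
console.log(history.map((m) => m.role).join(',')); // system,user
```

Because the whole history is sent to the model on every call, the assistant's replies stay grounded in the earlier turns of that conversation.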

Lambda function handlers like connect, disconnect, and message respond to different WebSocket actions.

  • Connect Handler: When a client establishes a connection with the WebSocket, AWS triggers the connect event. This function saves the connection ID to DynamoDB for later use.
  • Disconnect Handler: The disconnect event is triggered when a client disconnects from the WebSocket. This function deletes the related connection ID from DynamoDB.
export const connect = async (event: APIGatewayProxyEvent) => {
  await new DynamoDB()
    .putItem({
      TableName: 'ConnectionIds',
      Item: {
        connectionId: { S: event.requestContext.connectionId! },
      },
    })
    .promise();
  return {
    statusCode: 200,
    body: JSON.stringify({ msg: 'OK' }),
  };
};

export const disconnect = async (event: APIGatewayProxyEvent) => {
  delete conversations[event.requestContext.connectionId!];
  await new DynamoDB()
    .deleteItem({
      TableName: 'ConnectionIds',
      Key: {
        connectionId: { S: event.requestContext.connectionId! },
      },
    })
    .promise();
  return {
    statusCode: 200,
    body: JSON.stringify({ msg: 'OK' }),
  };
};
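Note the `{ S: ... }` wrappers in the items above: the low-level DynamoDB client works with typed AttributeValues, where every string is wrapped as `{ S: "..." }`. A tiny sketch of that shape, using hypothetical helpers for illustration:

```typescript
// DynamoDB's low-level API wraps each value in a typed AttributeValue;
// a string becomes { S: "..." }. Hypothetical marshalling helpers:
type StringAttr = { S: string };

function toStringAttr(value: string): StringAttr {
  return { S: value };
}

function fromStringAttr(attr: StringAttr): string {
  return attr.S;
}

// A connection id round-trips through the attribute wrapper unchanged.
const item = { connectionId: toStringAttr('Jk1aBcdEIAMCJtw=') };
console.log(fromStringAttr(item.connectionId)); // Jk1aBcdEIAMCJtw=
```

This is why the scan results later in the handler are read as `connectionId.S` rather than plain strings.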
  • Message Handler: When a client sends a message, the message event is triggered. This function handles the processing of chat messages. When a message is received, prompt and conversationId are extracted. The conversation details are stored in a local object conversations. If a conversation doesn’t exist, a new one is created with an initial system message to instruct the model’s behaviour.
  1. It retrieves the prompt and conversationId from the message body.
  2. It checks if the conversation exists in the conversations object. If it does not, it initializes a new conversation with a system message.
  3. It pushes the user’s message to the conversations object.
  4. It calls the OpenAI API to generate a response, using the conversation history for context.
  5. It pushes the assistant’s message (OpenAI’s response) to the conversations object.
  6. It fetches all active connection IDs from DynamoDB.
  7. It broadcasts the assistant’s message to all active connections.
export const message = async (event: APIGatewayProxyEvent) => {
  const { prompt, conversationId }: MessageBody = JSON.parse(event.body!);

  console.log('prompt', prompt);

  if (!conversations[conversationId]) {
    conversations[conversationId] = [
      // System prompt (Korean): "Act like a seller on Joonggonara (a
      // second-hand marketplace). You will now talk with a buyer. Assume
      // you are selling a used item and try to sell it at the highest
      // possible price. Think, act, and answer like a Korean woman in her
      // fifties from the neighborhood. Use casual speech, not honorifics."
      { role: 'system', content: '중고나라에서 물품 판매하는 사람처럼 행동하라. 너는 이제 구매자와 대화하게 될 것이다. 구매자와 중고 물건을 판매한다고 생각하고 최대한 높은 가격에 팔기 위해 노력하라. 너는 동네 50대 한국인 아줌마처럼 생각하고 행동하고 답해야 한다. 경어체를 쓰지 말고 반말체를 사용하라.' },
    ];
  }

  conversations[conversationId].push({ role: 'user', content: prompt });

  const completion = await openai.createChatCompletion({
    model: 'gpt-3.5-turbo-16k',
    messages: conversations[conversationId],
    temperature: 0.6,
    max_tokens: 100,
  });

  const openaiResponse = completion.data.choices[0].message!.content;

  conversations[conversationId].push({ role: 'assistant', content: openaiResponse });

  console.log('openaiResponse', openaiResponse);

  const dbResult = await new DynamoDB()
    .scan({
      TableName: 'ConnectionIds',
      ProjectionExpression: 'connectionId',
    })
    .promise();

  const api = newApiGatewayManagementApi({
    endpoint: event.requestContext.domainName! + '/' + event.requestContext.stage!,
  });

  console.log('dbResult', dbResult);

  // Broadcast the assistant's reply to every stored connection.
  await Promise.all(
    (dbResult.Items ?? []).map(({ connectionId }) =>
      api
        .postToConnection({
          ConnectionId: connectionId.S!,
          Data: JSON.stringify({
            message: openaiResponse,
            conversationId,
          }),
        })
        .promise(),
    ),
  );

  return {
    statusCode: 200,
    body: JSON.stringify({ msg: 'OK' }),
  };
};

The createChatCompletion function is then invoked with these messages. The OpenAI response is pushed back to the conversation, and the updated message is sent to all connected clients via API Gateway Management API. This operation enables real-time, two-way communication between the user and the AI model.

The OpenAI model chosen here is gpt-3.5-turbo-16k, a GPT-3.5 variant with a 16k-token context window. The code also sets the temperature and max_tokens parameters to control the randomness and length of the output, respectively.

This architecture allows multiple concurrent conversations with different users. Each user receives responses from the OpenAI model based on their conversation history, and all responses are broadcast to every active connection. Keep in mind that the conversations object lives in the memory of a single Lambda container, so history survives only as long as requests keep hitting that same container; a production version would persist it, for example in DynamoDB.

AWS API Gateway Management API

A crucial part of this function is interacting with API Gateway’s Management API. AWS provides ApiGatewayManagementApi, a WebSocket management client included in the aws-sdk; it is used to send the AI-generated response back to the connected clients. The implementation here uses the @yingyeothon/aws-apigateway-management-api library, a thin wrapper that adds the missing constructor to older versions of the AWS SDK, allowing easier interaction with the Management API.

The API sends messages to peers identified by a connectionId. To post back to clients, the Management API client must target the specific WebSocket endpoint of each deployment, so it is essential to configure the client with the appropriate address. This address is generally derived from configuration rather than hard-coded, and the event.requestContext object conveniently provides the pieces.

Typically, this is event.requestContext.domainName + '/' + event.requestContext.stage, which corresponds to https://API-ID.execute-api.REGION.amazonaws.com/STAGE. When using a custom domain name with API Gateway, this path must be adjusted accordingly.
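As a sketch, the endpoint string handed to the Management API client is simply the domain name and stage joined with a slash, mirroring the concatenation in the handler (the helper name here is hypothetical):

```typescript
// Hypothetical helper: assemble the Management API endpoint from the
// pieces available in event.requestContext.
function buildManagementEndpoint(domainName: string, stage: string): string {
  return `${domainName}/${stage}`;
}

console.log(buildManagementEndpoint('abc123.execute-api.us-east-1.amazonaws.com', 'dev'));
// abc123.execute-api.us-east-1.amazonaws.com/dev
```

Deriving the endpoint from the incoming event means the same handler works across stages and regions without any hard-coded URL.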

This function iterates over all the connectionIDs in the database, sending the OpenAI response to all connected clients.

Let’s connect the functions we have written to the appropriate WebSocket routes, which are defined in serverless.yml:

functions:
  connect:
    handler: handler.connect
    events:
      - websocket:
          route: $connect
  disconnect:
    handler: handler.disconnect
    events:
      - websocket:
          route: $disconnect
  broadcast:
    handler: handler.broadcast
    events:
      - websocket:
          route: $default

Using CloudFormation resources, we can declare the DynamoDB table used by this service, so the table is created along with the deployment and removed when the service is removed. More precisely, the DynamoDB table is included in the same CloudFormation stack as the API Gateway, Lambda, and CloudWatch resources, so everything is logged and managed together.

resources:
  Resources:
    ConnectionTable:
      Type: AWS::DynamoDB::Table
      Properties:
        TableName: ConnectionIds
        AttributeDefinitions:
          - AttributeName: connectionId
            AttributeType: S
        KeySchema:
          - AttributeName: connectionId
            KeyType: HASH
        ProvisionedThroughput:
          ReadCapacityUnits: 5
          WriteCapacityUnits: 5

In the provider section, declare IAM permissions that allow each Lambda function to access DynamoDB and the WebSocket API. To keep the example concise, the permissions here are wide open (*); avoid this in a real service.

provider:
  name: aws
  runtime: nodejs10.x
  iamRoleStatements:
    - Effect: "Allow"
      Action:
        - "execute-api:ManageConnections"
      Resource: "*"
    - Effect: "Allow"
      Action:
        - "dynamodb:*"
      Resource: "*"

In my setup, I used the Serverless Framework with the serverless-esbuild plugin. The AWS provider is configured with a Node.js runtime and runs in the us-east-1 region. We define the necessary IAM role to interact with DynamoDB, plus our WebSocket routes: connect, disconnect, and message.

// https://github.com/Web3-Study-with-Sigrid-Jin/serverless-chatgpt-ws/blob/dev/serverless.ts
import type { AWS } from '@serverless/typescript';

const serverlessConfiguration: AWS = {
  service: 'websocket-chat',
  frameworkVersion: '3',
  plugins: ['serverless-esbuild'],
  provider: {
    name: 'aws',
    runtime: 'nodejs14.x',
    region: 'us-east-1',
    stage: 'dev',
    apiGateway: {
      minimumCompressionSize: 1024,
      shouldStartNameWithService: true,
    },
    environment: {
      AWS_NODEJS_CONNECTION_REUSE_ENABLED: '1',
      NODE_OPTIONS: '--enable-source-maps --stack-trace-limit=1000',
    },
    iam: {
      role: {
        statements: [
          {
            Effect: 'Allow',
            Action: [
              'dynamodb:PutItem',
              'dynamodb:GetItem',
              'dynamodb:DeleteItem',
              'dynamodb:Scan',
            ],
            Resource: {
              'Fn::Sub': [
                'arn:aws:dynamodb:${region}:*:table/ConnectionIds',
                {
                  region: '${aws:region}',
                },
              ],
            },
          },
        ],
      },
    },
  },
  functions: {
    connect: {
      handler: 'src/functions/hello/handler.connect',
      events: [
        {
          websocket: {
            route: '$connect',
          },
        },
      ],
    },
    disconnect: {
      handler: 'src/functions/hello/handler.disconnect',
      events: [
        {
          websocket: {
            route: '$disconnect',
          },
        },
      ],
    },
    message: {
      handler: 'src/functions/hello/handler.message',
      events: [
        {
          websocket: {
            route: 'message',
          },
        },
      ],
    },
  },
  package: { individually: true },
  custom: {
    esbuild: {
      bundle: true,
      minify: false,
      sourcemap: true,
      exclude: ['aws-sdk'],
      target: 'node14',
      define: { 'require.resolve': undefined },
      platform: 'node',
      concurrency: 10,
    },
  },
};

module.exports = serverlessConfiguration;

Deployment

Building a serverless application on the AWS stack is simple: install the Serverless (sls) CLI, scaffold an application with the sls create command, then deploy it with sls deploy.

Now that all the code has been written, it can be deployed for testing. Ensure that your AWS credentials are correctly configured. If you use the default profile, there is nothing more to do; if you manage multiple profiles, set the AWS_PROFILE environment variable appropriately or pass --aws-profile PROFILE-NAME to sls.

Use the sls deploy command to deploy. This bundles your code (with esbuild, given the serverless-esbuild plugin), zips it, uploads it to an S3 bucket, and creates a CloudFormation stack. After the deployment completes, the console displays the URL of the WebSocket API endpoint, which can be used to connect to the WebSocket server.

Use any WebSocket client to connect to the server at the provided URL, and it should work. Note that if the broadcast handler is bound to the $default route (as in the serverless.yml example above), any message you send will be handled.
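For the message route configured in serverless.ts, the client frame needs an action field for route selection, plus the prompt and conversationId fields expected by the handler. A hypothetical sketch of building that frame:

```typescript
// Hypothetical helper: build the JSON frame a client sends on the
// "message" route. "action" drives route selection; prompt and
// conversationId match the handler's MessageBody interface.
function buildClientFrame(prompt: string, conversationId: string): string {
  return JSON.stringify({ action: 'message', prompt, conversationId });
}

const frame = buildClientFrame('How much is the bike?', 'conv-1');
console.log(frame);
// {"action":"message","prompt":"How much is the bike?","conversationId":"conv-1"}
```

Sending this frame from any WebSocket client (wscat, a browser, etc.) should trigger the message handler and broadcast the model's reply back to you.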

You’ve successfully created an AWS Serverless WebSocket Server that communicates with the GPT-3.5-turbo model!


Sigrid Jin

Software Engineer at Sionic AI / Machine Learning Engineer, Kubernetes. twitter.com/@sigridjin_eth