Aug 03, 2023

How to Use Node-Schedule to Build and Deploy a Hacker News Aggregator

Introduction

Modern software applications perform a multitude of recurring tasks, such as processing and exchanging data, generating reports, and more. While some of these tasks require specific timing, others take longer to complete and cannot be executed on the application's primary thread. Consequently, a scheduling mechanism is necessary to execute these tasks efficiently.

Cron is a widely used scheduling tool that allows users to automate and schedule recurring tasks at specific intervals, dates or times. It helps eliminate manual execution and ensures the timely execution of commands and scripts. The node-schedule library is a flexible cron-like job scheduler for Node.js, offering cron-style, date-based, and recurrent rule scheduling.

In this tutorial, you will set up and run cron jobs in Node.js with the node-schedule library by building a Hacker News story aggregator. By the end, you will have a Node.js script that gathers the ten most commented-on stories on Hacker News and sends them in a scheduled daily email.

You can deploy and preview the node-schedule application from this guide using the Deploy to Koyeb button below:

Deploy to Koyeb

Note: Remember to replace the values of the environment variables with your own information (as described in the Mailgun and schedule email sections).

You can consult the repository on GitHub to find out more about the application that this guide builds.

Requirements

To successfully follow this tutorial, you need the following:

  • Node.js and npm installed on your development machine. The demo app in this tutorial uses Node.js version 18.16.1.
  • Git installed on your development machine.
  • A Mailgun account to send the emails.
  • A Koyeb account to deploy the application.

Set up the Mailgun account

Mailgun provides APIs for sending, receiving and tracking emails. In this section, you will log into your Mailgun account and get your API key and sandbox domain information.

While logged into your verified Mailgun account, retrieve your private API key by clicking the API keys link located on the right sidebar of the dashboard page. Copy the Private API key displayed on the page and store it securely for future use.

Next, select the Sending link in the left sidebar. This will direct you to the Domains page, where you can find your sandbox domain address presented in the following format:

sandbox<RANDOM_NUMBER>.mailgun.org

Be sure to copy and save this domain address in a safe place for your future reference.

For testing purposes, Mailgun restricts sandbox domains to sending emails to authorized email addresses only. To authorize an email address, click your domain URL on the Domains page to access its Overview page. On the right sidebar of the Overview page, enter the email address you want to send test emails to in the Authorized Recipients input field and click the Save Recipient button.

Mailgun will send a verification email to the provided address. In the verification email, click the I Agree button to complete the authorization process. If you refresh the page in Mailgun, you will see that the target email address is now marked as verified.

You now have all of the information you need to send emails using your Mailgun account. In the next section, you will set up the project and install the necessary libraries and dependencies.

Set up the project

In this section, we will set up an npm project with TypeScript and install all necessary dependencies to build the Hacker News story aggregator.

To begin, create a project root directory on your development machine by typing:

mkdir -p hn_post_aggregator/src

The hn_post_aggregator directory serves as the main directory for the demo application. The src directory nested within will hold the actual project code.

Next, execute the commands below to initialize a Git repository in the newly created hn_post_aggregator directory:

cd hn_post_aggregator
git init

The first command above changes your terminal's current working directory to the hn_post_aggregator directory, while the second command initializes a Git repository there.

Next, generate an npm project within the hn_post_aggregator directory by typing:

npm init -y

This initializes an npm project using default values and creates a package.json file in the project's root directory.

Having initialized the npm project, proceed to install the necessary libraries and packages for building the Hacker News story aggregator:

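npm install mailgun.js form-data node-schedule dotenv axios
npm install --save-dev typescript nodemon ts-node @types/node-schedule

The npm install command installs the specified libraries and type definitions. The first command installs dependencies required to run the demo application, while the second command installs development-related dependencies.

The dependencies installed include:

  • mailgun.js: Mailgun's JavaScript SDK, used to send the emails.
  • form-data: Builds multipart form data; the Mailgun client is given it when it is instantiated.
  • node-schedule: The cron-like job scheduler for Node.js.
  • dotenv: Loads environment variables from a .env file.
  • axios: A promise-based HTTP client, used to call the Hacker News API.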

Additionally, the libraries installed for development purposes include:

  • typescript: Provides the TypeScript compiler (tsc), which compiles TypeScript code to JavaScript.
  • nodemon: Detects code changes to restart the application during development.
  • ts-node: Executes TypeScript files directly, without a separate compile step.
  • @types/node-schedule: Type definitions for node-schedule.

Afterwards, create a tsconfig.json file in the root directory of the project, and insert the following code into the file:

{
  "compilerOptions": {
    "target": "es2016",
    "module": "commonjs",
    "esModuleInterop": true,
    "forceConsistentCasingInFileNames": true,
    "strict": true,
    "skipLibCheck": true
  },
  "include": ["src/**/*.ts"],
  "exclude": ["node_modules"]
}

The tsconfig.json file indicates that this is a TypeScript project and defines the necessary options to compile the project.

With that last step, the project setup is complete. In the next section, you will start building the Hacker News story aggregator by adding the ability to retrieve the top Hacker News stories.

Retrieve top Hacker News stories

Hacker News offers a publicly available API that allows access to its stories, comments, polls, and other content on its platform in near real-time. The API provides access to these resources, known as "items", through their respective IDs.

Furthermore, the API provides endpoints that return lists of up to 500 item IDs for the top, best, and new stories. Starting from the list of top stories, it is possible to fetch the Hacker News stories with the highest number of comments.
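As a small illustration of how these endpoints are laid out, the sketch below builds the URL of a story list and of a single item. The helper names storyListUrl and itemUrl are hypothetical and are not part of the code this guide builds; the base URL is the same one used in the next listing.

```typescript
// Base URL of the public Hacker News API (also used later in this guide).
const HN_API_BASE = 'https://hacker-news.firebaseio.com/v0'

// The three story lists mentioned above.
type StoryList = 'top' | 'new' | 'best'

// Hypothetical helper: URL of a list of story IDs, for example topstories.json.
const storyListUrl = (list: StoryList): string => `${HN_API_BASE}/${list}stories.json`

// Hypothetical helper: URL of a single item (story, comment, poll, and so on).
const itemUrl = (id: number): string => `${HN_API_BASE}/item/${id}.json`

console.log(storyListUrl('top')) // https://hacker-news.firebaseio.com/v0/topstories.json
console.log(itemUrl(36946104)) // https://hacker-news.firebaseio.com/v0/item/36946104.json
```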

To fetch the top ten most commented-on Hacker News stories, start by creating an index.ts file in the src directory and add the following code to the file:

import axios from 'axios'

type ItemType = 'job' | 'story' | 'comment' | 'poll' | 'pollopt'
type Item = {
  id: number
  type?: ItemType
  by?: string
  time?: number
  text?: string
  parent?: number
  poll?: number
  kids?: number[]
  url?: string
  score?: number
  title?: string
  descendants?: number
}

// fetch top 10 Hacker News posts
const fetchTopHNPosts = async (): Promise<Item[]> => {
  try {
    const response = await axios.get('https://hacker-news.firebaseio.com/v0/topstories.json')
    const topStoriesPromises = response.data.map((id: number) =>
      axios.get(`https://hacker-news.firebaseio.com/v0/item/${id}.json`)
    )

    const posts = await Promise.all(topStoriesPromises)
    const postData: Item[] = posts.map((post: any) => post.data)
    return postData.sort((a, b) => (b.descendants || 0) - (a.descendants || 0)).slice(0, 10)
  } catch (error) {
    console.error('Error fetching Hacker News posts: ', error)
    return []
  }
}

// test it out
fetchTopHNPosts().then((val) => console.log(val))

The above code imports the axios library and defines an Item object type representing a Hacker News "item".

Next, it defines a fetchTopHNPosts function, which first requests the list of IDs for the current top stories and then, for each ID in that list, requests the corresponding item. The item requests run in parallel through Promise.all.
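Because the list of top stories can contain up to 500 IDs, the function may start several hundred requests at once. If that proves too aggressive for your environment, one option is to work through the IDs in fixed-size batches. The following is a minimal sketch of that idea; mapInBatches and the batch size are assumptions made for illustration and are not part of the tutorial's code.

```typescript
// Hypothetical helper: apply an async function to every input,
// running at most `batchSize` calls at the same time.
const mapInBatches = async <T, R>(
  inputs: T[],
  batchSize: number,
  fn: (input: T) => Promise<R>
): Promise<R[]> => {
  if (batchSize < 1) throw new Error('batchSize must be at least 1')
  const results: R[] = []
  for (let start = 0; start < inputs.length; start += batchSize) {
    const batch = inputs.slice(start, start + batchSize)
    // Wait for the whole batch to finish before starting the next one.
    results.push(...(await Promise.all(batch.map(fn))))
  }
  return results
}
```

Inside fetchTopHNPosts, a call such as mapInBatches(response.data, 25, (id: number) => axios.get(...)) could then stand in for the single Promise.all call. The results keep the order of the input IDs.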

The fetched items are then sorted in descending order by the number of comments each has received (the descendants field), and the first ten items of the sorted list are returned. Finally, to test the fetchTopHNPosts function, we invoke it and print the return value to the console.
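The ranking step can also be looked at in isolation. The sketch below is a hypothetical variant, not the tutorial's implementation: pickMostCommented and RankableItem are names invented here, and the extra filtering (dropping null entries and items that are not stories) is an assumption about what you may want, since the top stories list can also contain job postings.

```typescript
// Minimal shape needed for ranking; the tutorial's Item type has more fields.
type RankableItem = { id: number; type?: string; descendants?: number }

// Hypothetical helper: keep only stories and return the `limit` most commented.
const pickMostCommented = <T extends RankableItem>(items: Array<T | null>, limit = 10): T[] =>
  items
    // Drop missing entries and anything that is not a story.
    .filter((item): item is T => item !== null && item.type === 'story')
    // filter() returned a new array, so sorting does not modify the input.
    .sort((a, b) => (b.descendants ?? 0) - (a.descendants ?? 0))
    .slice(0, limit)

const sample: Array<RankableItem | null> = [
  { id: 1, type: 'story', descendants: 12 },
  { id: 2, type: 'job' },
  null,
  { id: 3, type: 'story', descendants: 705 },
]
console.log(pickMostCommented(sample, 2).map((item) => item.id)) // [ 3, 1 ]
```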

To test out the code, update the scripts section of your project's package.json with the following line:

...
"scripts": {
  "dev": "nodemon --watch './**/*.ts' --exec ts-node ./src/index.ts",
  "test": "echo \"Error: no test specified\" && exit 1"
}
...

The dev command added above runs the code in your index.ts file using nodemon and ts-node. To execute it, run the command below in your terminal window:

npm run dev

If everything works correctly, you should see an output like the one below in your terminal window:

[
  {
    by: 'andrewl',
    descendants: 705,
    id: 36946104,
    kids: [
      36948015, 36947154, 36948748, 36948416, 36947573, 36946860, 36952644, 36948697, 36947700, 36947315,
      36947085, 36949982, 36947266, 36954534, 36947176, 36950587, 36948479, 36950451, 36950189, 36949383,
      36948032, 36956363, 36948533, 36953131, 36948458, 36946797, 36950395, 36948762, 36948212, 36946542,
      36948910, 36946742, 36954116, 36952678, 36948357, 36950981, 36948184, 36950921, 36958411, 36953938,
      36949245, 36948969, 36952406, 36953223, 36948920, 36949922, 36949267, 36949079, 36951666, 36951065,
      36948452, 36951701, 36951711, 36946661, 36948887, 36951512, 36950440, 36948284, 36949331, 36947263,
      36954150, 36950667, 36951132, 36947884, 36951909, 36951376, 36951020, 36950806, 36948595, 36949843,
      36949733, 36949145, 36948863, 36948130, 36947599, 36948923, 36948293, 36948572, 36946884, 36947198,
      36947123, 36948756, 36946953, 36950158,
    ],
    score: 378,
    time: 1690825015,
    title: 'Marijuana addiction: those struggling often face skepticism',
    type: 'story',
    url: 'https://www.washingtonpost.com/health/2023/07/31/marijuana-addiction-legal-recreational-sales/',
  },
]

Once you've confirmed that it is working correctly, open the src/index.ts file again and comment out the line executing the fetchTopHNPosts function. Moving forward, we will want this information formatted into an email instead of printing the results directly to the console.

In the next section, you will create an email containing the retrieved top ten stories with the highest number of comments. Furthermore, you will implement the necessary functionality to send this email using Mailgun.

Create email content

Now that we can generate the list of the top ten Hacker News stories with the most comments, we can compose an HTML email showcasing these stories and send it to specified recipients using Mailgun.

To get started, create a .env file in your project's root directory and add the following lines to the file, replacing the placeholder values with your own:

EMAIL="postmaster@<YOUR_MAILGUN_DOMAIN>"
API_KEY="<YOUR_MAILGUN_PRIVATE_API_KEY>"
DOMAIN="<YOUR_MAILGUN_DOMAIN>"
RECIPIENT_EMAIL="<YOUR_VERIFIED_RECIPIENT_EMAIL>"

Your .env file contains secret values and should be kept private. To ensure the file and its content aren't committed to Git, create a .gitignore file by typing:

printf "%s\n" ".env" "node_modules" "src/*.js" > .gitignore

The command above creates a .gitignore file containing lines that will exclude the .env file and the node_modules directory from the Git repository. It also excludes any JavaScript files generated from our TypeScript files since we want to compile those files each time we deploy.

Next, update the top of your index.ts file with the following code:

import axios from "axios";
import "dotenv/config";
import formData from "form-data";
import Mailgun from "mailgun.js";

const mailgun = new Mailgun(formData);
const client = mailgun.client({
  username: "api",
  key: process.env.API_KEY as string,
});
...

The code above imports the dotenv/config, form-data, and mailgun.js libraries and instantiates a new Mailgun client. The Mailgun client will handle sending composed emails.
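The `as string` cast above tells TypeScript to trust that API_KEY is set, so a missing value only surfaces later as a failed Mailgun request. As a hedged alternative, the sketch below fails at startup instead. The requireEnv helper and the Env type are hypothetical names introduced for illustration.

```typescript
// An environment-like object, for example process.env.
type Env = Record<string, string | undefined>

// Hypothetical helper: read a required setting and fail early,
// with a clear message, if it is missing or blank.
const requireEnv = (env: Env, name: string): string => {
  const value = env[name]
  if (value === undefined || value.trim() === '') {
    throw new Error(`Missing required environment variable: ${name}`)
  }
  return value
}

// Example with a stand-in object; the value is a placeholder, not a real key.
const exampleEnv: Env = { API_KEY: 'key-123', DOMAIN: '' }
console.log(requireEnv(exampleEnv, 'API_KEY')) // key-123
```

In the tutorial's code, the same helper could be called as requireEnv(process.env, 'API_KEY') when creating the client.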

Next, add the following lines to the bottom of the index.ts file:

...
const sendEmail = async (posts: Item[]): Promise<void> => {
  const html: string = `
  <h1>Top 10 Hacker News Posts with Most Comments</h1>
  <ul>
    ${posts
      .map(
        (post: Item) =>
          `<li><a href="${post.url}">${post.title} - ${
            post.descendants || 0
          } comments</a></li>`
      )
      .join("")}
  </ul>
  `;

  const messageData = {
    from: `"HN News" <${process.env.EMAIL}>`,
    to: [process.env.RECIPIENT_EMAIL as string], // or array of emails
    subject: "Currently 🔥 on HN",
    html,
  };

  await client.messages
    .create(process.env.DOMAIN as string, messageData)
    .then((res) => console.log("Email sent: ", res))
    .catch((err) => console.error("Error sending email: ", err));
};

The code above adds a sendEmail function, which iterates through an array of Item objects and generates the HTML content of the email: a list of stories, each with its comment count and a link to the story. It then prepares the message data (the sender's and recipient's email addresses, the subject, and the content) and passes it, together with your Mailgun domain, to the Mailgun client. The client attempts to send the email and logs the outcome to the console.
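Two details of the generated HTML may deserve extra care: some items (for example "Ask HN" posts) have no url field, and titles can contain characters such as < or &. The sketch below shows one possible way to handle both. The names Post, escapeHtml, postLink, and renderPostItem are hypothetical, and falling back to the story's discussion page on news.ycombinator.com is a design choice made for this example, not something the tutorial's code does.

```typescript
type Post = { id: number; title?: string; url?: string; descendants?: number }

// Escape characters that have a special meaning in HTML text and attributes.
const escapeHtml = (text: string): string =>
  text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')

// Items without an external URL fall back to their Hacker News discussion page.
const postLink = (post: Post): string =>
  post.url ?? `https://news.ycombinator.com/item?id=${post.id}`

// Render one list entry with a quoted href and escaped text.
const renderPostItem = (post: Post): string =>
  `<li><a href="${escapeHtml(postLink(post))}">${escapeHtml(post.title ?? 'Untitled')} - ${
    post.descendants ?? 0
  } comments</a></li>`

console.log(renderPostItem({ id: 1, title: 'Ask HN: <b>?', descendants: 3 }))
// <li><a href="https://news.ycombinator.com/item?id=1">Ask HN: &lt;b&gt;? - 3 comments</a></li>
```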

In the next section, we will schedule a cron job to fetch the top Hacker News stories and send them via email at a specific, designated time.

Schedule email

The necessary logic for retrieving the top ten most commented-on Hacker News stories and sending them through an email has been successfully implemented. In this section, you'll schedule a cron job to automate the execution of this code.

Cron jobs are managed through a table called crontab, which holds an entry for each scheduled task. Each entry consists of five space-separated fields that define when the task should run, followed by the command to be executed:

// Cron syntax

* * * * * command_to_be_executed
│ │ │ │ │
│ │ │ │ └─── Day of the week (0 - 7) (Sunday is 0 or 7)
│ │ │ └──────── Month (1 - 12)
│ │ └───────────── Day of the month (1 - 31)
│ └────────────────── Hour (0 - 23)
└────────────────────── Minute (0 - 59)

// An asterisk (*) in a field means "every" possible value for that field.

The node-schedule library handles the scheduling and execution of cron jobs and accepts both the cron syntax and other formats for time-based scheduling.
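To make the five fields described above concrete, the following sketch splits a cron expression into named parts. It is a toy helper for illustration only: describeCron is a hypothetical name, it is not part of node-schedule, and it checks just the number of fields, not whether each value is in range.

```typescript
// The five fields of a classic cron expression, in order.
const CRON_FIELDS = ['minute', 'hour', 'dayOfMonth', 'month', 'dayOfWeek'] as const
type CronField = (typeof CRON_FIELDS)[number]

// Hypothetical helper: split a five-field cron expression into named fields.
const describeCron = (expression: string): Record<CronField, string> => {
  const parts = expression.trim().split(/\s+/)
  if (parts.length !== CRON_FIELDS.length) {
    throw new Error(`Expected ${CRON_FIELDS.length} fields, got ${parts.length}`)
  }
  const named = {} as Record<CronField, string>
  CRON_FIELDS.forEach((field, index) => {
    named[field] = parts[index]
  })
  return named
}

// "0 8 * * 1-5" reads as: minute 0, hour 8, Monday through Friday.
const weekdayMornings = describeCron('0 8 * * 1-5')
console.log(weekdayMornings.hour, weekdayMornings.dayOfWeek) // 8 1-5
```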

To schedule fetching and sending of the top ten most commented-on Hacker News stories, add the following import to the top of your index.ts file:

import schedule from 'node-schedule'

Next, add the code below to the end of your index.ts file:

// create a schedule for fetching and sending posts in an email
const executionSchedule = process.env.SCHEDULE || '*/2 * * * *'
schedule.scheduleJob(executionSchedule, async () => {
  console.log('=== fetching posts')
  const posts: Item[] = await fetchTopHNPosts()

  console.log('=== sending email')
  await sendEmail(posts)
})

The code above schedules a cron job using the node-schedule library's scheduleJob method. The method takes a cron syntax schedule from a SCHEDULE environment variable, falling back on a default value of "*/2 * * * *". The cron syntax "*/2 * * * *" defines the following schedule:

  • Minute: "*/2" means "every 2 minutes."
  • Hour: "*" means "every hour."
  • Day of the month: "*" means "every day of the month."
  • Month: "*" means "every month."
  • Day of the week: "*" means "every day of the week."

This means the cron job is scheduled to run every two minutes. Alternatively, you can pass a JavaScript Date object, an object literal, or a node-schedule "recurrence rule" to the scheduleJob method for greater flexibility in scheduling tasks. Visit the node-schedule documentation to learn about other usage options.
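As an illustration of the recurrence rule alternative, the sketch below describes a daily 08:00 schedule without a cron string. RecurrenceRule, scheduleJob, and the rule's hour, minute, and tz properties come from node-schedule as I understand its API; buildDailyRule and scheduleDaily are hypothetical helpers, and the 'Etc/UTC' time zone is only an example value. Check the node-schedule documentation for the version you install before relying on it.

```typescript
import { Job, RecurrenceRule, scheduleJob } from 'node-schedule'

// Build a rule that fires once a day at the given time.
// The time zone is set explicitly; 'Etc/UTC' is only an example value.
const buildDailyRule = (hour: number, minute: number, tz = 'Etc/UTC'): RecurrenceRule => {
  const rule = new RecurrenceRule()
  rule.hour = hour
  rule.minute = minute
  rule.tz = tz
  return rule
}

// Hypothetical wrapper: schedule `task` daily at 08:00 and return the Job,
// so the caller can stop it later with job.cancel().
const scheduleDaily = (task: () => Promise<void>): Job =>
  scheduleJob(buildDailyRule(8, 0), () => {
    task().catch((error) => console.error('Scheduled task failed: ', error))
  })
```

Nothing is scheduled until scheduleDaily is called, so the snippet can be pasted into a file without starting a job.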

When the job is triggered, the fetchTopHNPosts function is executed to gather the ten most commented-on Hacker News stories. These stories are then sent via email using the sendEmail function.
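With a short interval such as two minutes, it can help to guard the job body. The sketch below is one possible shape under stated assumptions: createDigestJob and Story are hypothetical names, and the fetch and send steps are passed in as parameters so the example stays self-contained. The returned function skips a run when the previous one is still in progress or when no posts were fetched, and it reports failures instead of throwing.

```typescript
type Story = { id: number; title?: string }
type RunResult = 'sent' | 'skipped' | 'failed'

// Hypothetical factory: wrap the fetch and send steps in one guarded job function.
const createDigestJob = (
  fetchPosts: () => Promise<Story[]>,
  sendEmail: (posts: Story[]) => Promise<void>
) => {
  let running = false
  return async (): Promise<RunResult> => {
    if (running) return 'skipped' // the previous run has not finished yet
    running = true
    try {
      const posts = await fetchPosts()
      if (posts.length === 0) return 'skipped' // nothing worth emailing
      await sendEmail(posts)
      return 'sent'
    } catch (error) {
      console.error('Digest job failed: ', error)
      return 'failed'
    } finally {
      running = false
    }
  }
}
```

The function returned by createDigestJob(fetchTopHNPosts, sendEmail) could then be passed to schedule.scheduleJob in place of the inline callback.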

To test out the scheduling functionality, run the npm run dev command in your terminal window. After about two minutes, you should see output like the one below in your terminal window:

=== fetching posts
=== sending email
Email sent:  {
  status: 200,
  id: '<2023020402.63946c68e@sandbox9e0bf95e8b61.mailgun.org>',
  message: 'Queued. Thank you.'
}

After a few more minutes, you should have an email like the one below in your inbox:

Scheduled email

Note: The email might show up in your spam folder if your Mailgun domain is unverified.

You have now successfully scheduled a cron job to fetch and send the top ten most commented-on stories on Hacker News. In the next section, you will deploy your code to Koyeb.

Deploy to Koyeb

Since the Hacker News story aggregator operates solely in the background without a frontend UI, deploying it as a worker is the optimal choice. Fortunately, Koyeb allows the deployment of a wide range of services, including background workers.

To prepare the demo app for deployment, adjust the scripts section of your package.json file to include the following lines:

...
"scripts": {
  "dev": "nodemon --watch './**/*.ts' --exec ts-node ./src/index.ts",
  "build": "npx tsc",
  "start": "node src/index.js",
  "test": "echo \"Error: no test specified\" && exit 1"
}
...

The build command compiles the TypeScript code in your index.ts file into JavaScript in an index.js file. The start command executes the compiled code.

Next, create a GitHub repository for your code and run the following commands in your terminal window to commit and push your changes to the repository:

git add --all
git commit -m "Complete Hacker News story aggregator with cron scheduling."
git remote add origin git@github.com:<YOUR_GITHUB_USERNAME>/<YOUR_REPOSITORY_NAME>.git
git branch -M main
git push -u origin main

In the Koyeb control panel, navigate to the Secrets tab and enter a new Secret for each environment variable defined in your local .env file. Next, go to the Overview tab and click the Create Worker Service button to begin:

  1. Select GitHub as your deployment method.
  2. From the repository drop-down menu, select the repository that contains your code. Alternatively, you can deploy from the example repository associated with this tutorial by entering https://github.com/koyeb/example-node-schedule in the Public GitHub repository field.
  3. In the Environment variables section, click the Add variable button. For each environment variable defined in your .env file, create a corresponding variable. For the variable type, select Secret and then choose the matching Secret you defined earlier.
  4. To change the schedule of the cron job from running every two minutes after deployment, add a SCHEDULE environment variable and set its value to your preferred cron schedule. To help you write the cron schedule, you can use an online cron expression editor such as Cronitor's Crontab Guru.
  5. Choose a name for your App and Service and click Deploy.

During the app deployment process, Koyeb detects and utilizes the build and start scripts specified in your package.json file to build and start the application. Throughout the deployment, you can monitor the progress with the displayed logs. After the deployment has finished and all necessary health checks have passed, your application will be operational.

At the scheduled time, the app will retrieve the top ten most commented-on stories on Hacker News and send them to the specified email address. Note that the scheduled task will execute according to the local time of the app's deployment region.

Conclusion

Congratulations! You have successfully created and deployed a Node.js cron job that collects the top ten most commented-on Hacker News stories and sends them via scheduled emails. Feel free to explore the other scheduling options offered by the node-schedule library to create more flexible schedules for your cron job.

As the worker was deployed using the Git deployment method, any new push to the deployed branch will automatically trigger a fresh build for your worker. Updates to your worker will go live as soon as the deployment successfully passes all required health checks. In the event of a deployment failure, Koyeb preserves the last functioning deployment in production, guaranteeing that your application remains continuously operational.

