Auto-Focusing Faces in Web-Cam Videos in Next.js

Eugene Musebe

Introduction

In this article, we will create a function that detects a person's face in a webcam recording and keeps it in focus as they move. This gives you more framing freedom when recording webcam videos of people.

Codesandbox

Check the sandbox demo on Codesandbox.

You can also use the Github repo here.

Prerequisites

Entry-level JavaScript and React/Next.js knowledge.

Setting Up the Sample Project

Create a new Next.js app by running npx create-next-app webcamfocus in your terminal, then head to your project root directory with cd webcamfocus.

To set up the Cloudinary integration, start by creating your Cloudinary account using Link and logging in to it. Your dashboard contains the environment variable keys that are necessary for the Cloudinary integration in our project.

In your project directory, start by adding Cloudinary to your project dependencies: npm install cloudinary. Then create a new file named .env.local and paste in the following code, filling in the blanks with the environment variables from your Cloudinary dashboard.

1".env.local"
2
3CLOUDINARY_CLOUD_NAME =
4
5CLOUDINARY_API_KEY =
6
7CLOUDINARY_API_SECRET =

Restart your project: npm run dev.

Create a new file pages/api/upload.js and begin by configuring the environment keys and libraries.

var cloudinary = require("cloudinary").v2;

cloudinary.config({
  cloud_name: process.env.CLOUDINARY_CLOUD_NAME,
  api_key: process.env.CLOUDINARY_API_KEY,
  api_secret: process.env.CLOUDINARY_API_SECRET,
});

Use a handler function to process the POST request. The function will receive the media file data, upload it to Cloudinary, capture the uploaded video's public ID, and send it back to the front end as a response.

export default async function handler(req, res) {
  if (req.method === "POST") {
    try {
      let fileStr = req.body.data;
      const uploadedResponse = await cloudinary.uploader.upload_large(
        fileStr,
        {
          resource_type: "video",
          chunk_size: 6000000,
        }
      );
      // Send the uploaded video's public ID back to the front end
      res.status(200).json({ public_id: uploadedResponse.public_id });
    } catch (error) {
      res.status(500).json({ error: "Something went wrong" });
    }
  }
}
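Note that the recording arrives as a base64 string in the JSON request body, and Next.js API routes reject bodies above roughly 1 MB by default. If uploads of longer clips fail, you can raise the limit by exporting a route config from the same pages/api/upload.js file. The snippet below is only a sketch; the 100mb value is an assumption and should be sized to the recordings you expect.

export const config = {
  api: {
    bodyParser: {
      // Assumed limit for illustration; adjust to the largest recording you expect
      sizeLimit: "100mb",
    },
  },
};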

Our front end will be coded in the pages/index.js file:

Start by adding @cloudinary/react and @cloudinary/url-gen to your project dependencies: npm install @cloudinary/url-gen @cloudinary/react.

In pages/index.js, include the necessary modules in your imports:

1"pages/index"
2
3
4import React, { useRef, useState } from "react";
5import { AdvancedVideo } from "@cloudinary/react";
6import { Cloudinary } from "@cloudinary/url-gen";
7
8import { fill } from "@cloudinary/url-gen/actions/resize";
9import { FocusOn } from "@cloudinary/url-gen/qualifiers/focusOn";
10import { Gravity } from "@cloudinary/url-gen/qualifiers";
11import { AutoFocus } from "@cloudinary/url-gen/qualifiers/autoFocus";
12
13const HTTP_SUCCESS = 200;
14const VIDEO_HEIGHT = 450;
15const VIDEO_WIDTH = 800;

The three constants below the imports will be used for the successful-response status code, the video height, and the video width, respectively.

Create a Cloudinary instance:

1"pages/index"
2
3
4const cld = new Cloudinary({
5 cloud: {
6 cloudName: "hackit-africa",
7 },
8});

Inside the Home component, declare the following variables. We will use them to link our video element to the webcam through a media stream:

1"pages/index"
2
3
4export default function Home() {
5 let recordedChunks = [];
6 let localStream = null;
7 let options = { mimeType: "video/webm; codecs=vp9" };
8 let mediaRecorder = null;
9
10 const rawVideo = useRef();
11 const [publicID, setPublicID] = useState("bcffgeg9cnjryfqdghz8");
12
13 return(
14 <>works</>
15 )
16}

Create a function startCamHandler like below:

1"pages/index"
2
3
4const startCamHandler = async () => {
5 console.log("Starting webcam and mic ..... ");
6 localStream = await navigator.mediaDevices.getUserMedia({
7 video: true,
8 audio: false,
9 });
10 rawVideo.current.srcObject = localStream;
11 rawVideo.current.addEventListener("loadeddata", (ev) => {
12 console.log("loaded data.");
13 });
14
15 mediaRecorder = new MediaRecorder(localStream, options);
16 mediaRecorder.ondataavailable = (event) => {
17 console.log("data-available");
18 if (event.data.size > 0) {
19 recordedChunks.push(event.data);
20 }
21 };
22 mediaRecorder.start();
23};

The above function first asks for the user's permission to open the webcam (audio is disabled in this example) and feeds the video element with data from the webcam through a media stream. All recorded data chunks are saved in the recordedChunks array.
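One caveat: not every browser supports the video/webm; codecs=vp9 MIME type used in options (Safari is a common exception). If recording fails, you can probe for a supported type with MediaRecorder.isTypeSupported and fall back. The helper below is a hypothetical sketch and the candidate list is an assumption, not part of the original code.

// Hypothetical helper: pick the first recording MIME type this browser supports.
const pickSupportedMimeType = () => {
  const candidates = [
    "video/webm; codecs=vp9",
    "video/webm; codecs=vp8",
    "video/webm",
  ];
  return candidates.find((type) => MediaRecorder.isTypeSupported(type)) || "";
};

// Usage: let options = { mimeType: pickSupportedMimeType() };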

There will also be a stopCamHandler function, which stops the media stream on the user's command, turns the recorded chunks into a Blob, and uses a FileReader (the readFile function) to get a string representation of the media file before passing it to the uploadVideo function.

1"pages/index"
2
3
4function readFile(file) {
5 console.log("readFile()=>", file);
6 return new Promise(function (resolve, reject) {
7 let fr = new FileReader();
8
9 fr.onload = function () {
10 resolve(fr.result);
11 };
12
13 fr.onerror = function () {
14 reject(fr);
15 };
16
17 fr.readAsDataURL(file);
18 });
19}
20
21const stopCamHandler = () => {
22 console.log("Hanging up the call ...");
23 localStream.getTracks().forEach((track) => track.stop());
24
25 mediaRecorder.onstop = async (event) => {
26 let blob = new Blob(recordedChunks, {
27 type: "video/webm",
28 });
29
30 // Save original video to cloudinary
31 await readFile(blob).then((encoded_file) => {
32 uploadVideo(encoded_file);
33 });
34 };
35};

We will use the code below to apply a Cloudinary transformation that keeps the user's face in focus as the video plays.

1"pages/index"
2
3
4const myVideo = cld.video(publicID);
5 // Apply the transformation.
6 myVideo.resize(
7 fill()
8 .width(VIDEO_WIDTH)
9 .height(VIDEO_HEIGHT)
10 .gravity(
11 Gravity.autoGravity().autoFocus(AutoFocus.focusOn(FocusOn.faces()))
12 )
13 );
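For context, fill() with auto gravity focused on faces corresponds to Cloudinary's c_fill and g_auto:faces URL parameters, so the delivery URL generated by url-gen should look roughly like the line below. This is an approximation for illustration only; the exact string depends on your url-gen version.

// Approximate delivery URL produced by the transformation above (illustrative only):
// https://res.cloudinary.com/hackit-africa/video/upload/c_fill,g_auto:faces,h_450,w_800/bcffgeg9cnjryfqdghz8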

Finally, we will have the uploadVideo function to post the video to the back end and use the response to set the video's public ID through the setPublicID state hook.

1"pages/index"
2
3
4const uploadVideo = async (base64) => {
5 console.log("uploading to backend...");
6 try {
7 fetch("/api/upload", {
8 method: "POST",
9 body: JSON.stringify({ data: base64 }),
10 headers: { "Content-Type": "application/json" },
11 }).then((response) => {
12 if (response.status === HTTP_SUCCESS) {
13 response.json().then((result) => {
14 console.log(result);
15 setPublicID(result.public_id);
16 });
17 }
18 console.log("successfull session", response.status);
19 });
20 } catch (error) {
21 console.error(error);
22 }
23};

Use the code below to fill in the DOM elements in your return statement. The CSS is in the GitHub repo.

1"pages/index"
2
3
4return (
5 <div className="container">
6 <h1>Auto-focusing faces in web-cam videos using next js</h1>
7 <div className="row">
8 <div className="column">
9 <video
10 className="display"
11 width={VIDEO_WIDTH}
12 height={VIDEO_HEIGHT}
13 ref={rawVideo}
14 autoPlay
15 playsInline
16 />
17 </div>
18 <div className="column">
19 {publicID && <AdvancedVideo cldVid={myVideo} controls />}
20 </div>
21
22 </div>
23 <div className="row">
24 <div className="column">
25 <div className="buttons">
26 <button className="button" onClick={startCamHandler}>
27 Start Webcam
28 </button>{' '}
29 <button id="close" className="button" onClick={stopCamHandler}>
30 Close and upload original video
31 </button>
32 </div>
33 </div>
34 </div>
35 </div>
36);
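To recap how the pieces fit together, everything above except the imports lives inside the Home component, roughly in the order sketched below. This is a structural outline, not complete code, and readFile can also live outside the component if you prefer.

export default function Home() {
  // recorder variables, rawVideo ref, and publicID state
  // startCamHandler: opens the webcam and starts recording
  // readFile and stopCamHandler: stop recording and encode the Blob
  // uploadVideo: POSTs the base64 data to /api/upload
  // myVideo transformation: cld.video(publicID) with face auto-focus
  // return (...the JSX above...)
}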

The full UI should look like below:

Eugene Musebe

Software Developer

I’m a full-stack software developer, content creator, and tech community builder based in Nairobi, Kenya. I am addicted to learning new technologies and love working with like-minded people.