The Complete Guide to Preparing for a High-Frequency Trading FPGA Developer Interview

We are working on a course that teaches an end-to-end implementation of all aspects of an HFT system described in this article. If this sounds interesting to you, please fill out this form to let us know. It will help us make choices that benefit the most people.

Developing hardware for a High Frequency Trading firm has long been, and still is, one of the hottest job descriptions for FPGA engineers. Thanks to the exceptional compensation offered by this industry, it might even be called the 'Dream Job' for many in this area. Having worked in this role in the past, I write this article from my own experience and that of many others about what these job roles look like and how you might prepare to crack the interview.

What the HFT world looks like:

So that you can understand the jargon and prepare in a more suitable way, let's first talk about what HFT is all about and why FPGAs have a place in the trading industry of all places.

High Frequency Trading is a variant of Algorithmic Trading, which is itself a variant of automated Day Trading in which an algorithm, rather than a human being, places the orders. The human writes and monitors the algorithm but doesn't manually place the trades. In HFT, all of this is pushed to the extreme by exploiting opportunities that may last only a few milliseconds, often arbitrage opportunities. To meaningfully exploit such opportunities and make money, one needs to digest the market data, make decisions and place a large number of orders in an extremely short period of time. This is where it becomes a race.

In his amazing book Trading at the Speed of Light, MacKenzie explains how, even in the pit trading days in Chicago, people adapted by creating systems that enabled some to trade much faster than others. Essentially, trading is by design a game where information asymmetry can be exploited. If, at this moment in time, you have some information that has not reached others, an asymmetry exists: you can predict what others are going to do once they get that information and make your own decisions accordingly. This asymmetry is the basis of arbitrage opportunities and many other complex scenarios where money can be made. Thanks to the significant advances in the tech used by the exchanges themselves, these asymmetries have become extremely rare and short-lived.

However, the fundamental truth is that it's physically impossible to eliminate this asymmetry, purely because those who are closer to the exchange get their data faster than those who are further from it. The same goes for those who are able to transmit data faster (microwave towers, leased optical lines). Similarly, those who are able to make faster decisions after having received the data can also get ahead in this race. This is the very problem an FPGA engineer is hired to help solve in an HFT firm.

Exchanges allow firms to set up their computers in a specific building close to it. Whatever stock broker you trade with probably has a machine there. This is called colocation. The exchange ensures that the length of the optical fibre going from these machines to the exchange computer is the exact same for every player in the colo.

In this setting, there is no longer scope left to go any closer to the exchange. What can be done though is to minimize the amount of time it takes to digest data and send out orders. To achieve this, these servers have insane specs when it comes to processing speed, memory speed and latency. They are overclocked and optimized to the edge of what they can offer.

Despite these optimizations, a CPU fundamentally has two quirks that are not too HFT friendly.

Higher Execution Latency

The CPU needs to break algorithms down into instructions from its ISA and execute them sequentially. Fast as it may be, this process has a lot of overhead, with several interacting elements in the critical path.

Non-Deterministic Execution Latency

The trading algorithm is not the only thing running on the CPU. The operating system manages many processes and threads at once. This can lead to the exact same procedure taking different amounts of time in different executions. This can be a huge problem for someone betting large amounts of money on a guarantee that the process will take some X amount of time. If it takes longer just this once, it could potentially mean a huge loss.
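To make the point about run-to-run variation concrete, here is a minimal sketch of how one might summarize a set of latency measurements. The sample values are made up for illustration (a software path that usually takes about 2 us, with one run delayed by a hypothetical scheduler preemption), and the function name is my own.

```python
import statistics

def summarize_latency(samples_ns):
    """Return median, 99th percentile, worst case and jitter (max - min) in ns."""
    ordered = sorted(samples_ns)
    # Nearest-rank 99th percentile: ceil(0.99 * n) using integer arithmetic.
    rank = max(1, -(-99 * len(ordered) // 100))
    return {
        "median": statistics.median(ordered),
        "p99": ordered[rank - 1],
        "worst": ordered[-1],
        "jitter": ordered[-1] - ordered[0],
    }

# Hypothetical measurements: 98 normal runs, one slightly slow, one preempted.
samples = [2000] * 98 + [2100, 50000]
summary = summarize_latency(samples)
print(summary)
```

The median looks perfectly healthy here, while the worst case is 25 times larger. It is that tail, not the average, that hurts in a race.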

How FPGAs Help

FPGAs help with both of the major problems mentioned above. They almost always solve the non-determinism. One can know beforehand exactly how many clock cycles a certain operation will take in FPGA silicon, and be sure that this will not change throughout operation.

They also often (not always) help with the other problem of high processing latency. In the entire pipeline of Decode Market Data -> Make Decisions -> Send Orders several sub-functions could use FPGA based acceleration.
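Because every stage takes a fixed number of clock cycles, the latency of an FPGA pipeline is simple arithmetic. The sketch below assumes hypothetical cycle counts for the three stages and a hypothetical 250 MHz clock; real numbers depend entirely on the design.

```python
def pipeline_latency_ns(stage_cycles, clock_mhz):
    """Total latency of a fixed-latency pipeline: total cycles times the clock period."""
    period_ns = 1000.0 / clock_mhz
    return sum(stage_cycles.values()) * period_ns

# Hypothetical cycle counts for Decode Market Data -> Make Decisions -> Send Orders.
stages = {"decode_market_data": 6, "make_decision": 4, "send_order": 5}
print(pipeline_latency_ns(stages, clock_mhz=250.0))  # 15 cycles at 4 ns each
```

Shaving one cycle off any stage removes exactly one clock period from the total, every single time, which is what makes cycle counting such a common interview theme.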

FPGAs don't handle complex mathematics well, so engineers closely analyse their processing pipelines and tease out ways to achieve substantial acceleration of selected processes. The following is what a trading machine interfaced with an FPGA looks like:

[Figure: hft_arch (a trading machine interfaced with an FPGA)]

As of today, there are many popular boards on the market, like this one, that provide a workbench for HFT-related FPGA development by taking care of some of the tough blocks such as Ethernet and PCIe.

Interview Prep

Now that you have a good mental image of what this world looks like, I hope your interest in working with these cutting-edge tools has only increased. Let's talk about the skills you need in your day-to-day job as an HFT FPGA Engineer.

Digital Electronics and RTL Design:

Proficiency in digital logic, Verilog, and microarchitecture development.

  1. These are fundamental. One needs demonstrated experience in writing RTL, debugging it and deploying it on a board. Interviews are often structured so that candidates are required to write RTL on the spot.
  2. The problem statements can also be quite complex and puzzle-like, unlike those of a traditional FPGA job in, say, the networking or defence industry. This tests both the problem-solving ability and the RTL skills of the candidate.
  3. Many firms also give an assignment that needs to be completed within a few hours of being given. This often includes implementing the problem statement in RTL and writing code to verify it.
  4. There is a very high emphasis on writing code that takes fewer cycles to do a certain task. This often requires brainstorming different ways to do the same thing and coming up with a low-latency alternative that is still reasonable enough to be implemented within the timing constraints.

    State machine coding

  5. State machines are another very common block found in these implementations. This article has a bunch of resources to effectively prepare for an RTL design interview.

    FIFOs:

  6. FIFOs will be everywhere: synchronous, asynchronous, all kinds. Can you answer a question on which parameters to consider when deciding how to size a FIFO correctly? The RTL interview prep article has a section on FIFOs if you need guidance.
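As a warm-up for the FIFO sizing question, here is a sketch of the textbook burst calculation. It assumes a writer that pushes one word per write clock for the whole burst and a reader that pops one word per read clock without pausing. The function name and the numbers are illustrative only, and a real asynchronous FIFO needs extra margin for pointer synchronization latency and back-to-back bursts.

```python
import math
from fractions import Fraction

def min_fifo_depth(burst_len, f_write_mhz, f_read_mhz):
    """Minimum entries needed to absorb one write burst while the reader drains continuously."""
    if f_read_mhz >= f_write_mhz:
        # The reader keeps up; only a minimal buffer is needed in this idealized model.
        return 1
    # Words the reader removes during the time the burst takes to be written.
    drained = Fraction(burst_len) * Fraction(f_read_mhz) / Fraction(f_write_mhz)
    return math.ceil(burst_len - drained)

# A 120-word burst written at 200 MHz and read at 150 MHz.
print(min_fifo_depth(120, 200, 150))  # 120 - 90 = 30 entries
```

In an interview, the follow-up questions are usually about what this model leaves out: idle cycles on either side, the worst-case spacing between bursts, and rounding the depth up to a power of two.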

Verification Fundamentals:

  1. FPGA teams are quite small in HFT firms, so a single developer has to wear many hats to deliver a project. Also, thanks to the size of the trades (millions of dollars at a time), a single bug in your design could lead to losses that span a few years of your compensation.
  2. So verification becomes a very important task. Despite the luxury of FPGAs that can be reprogrammed any number of times, the stakes are high and all care should be taken to ensure no bugs get through.
  3. Being comfortable with simulations and debugging through waveforms is an absolute necessity.
  4. Writing efficient test benches and test cases for your code is an invaluable skill. For the more experienced folks, knowledge of SystemVerilog and UVM might be a plus point.
  5. The important point to note is that both design and verification need to be handled by the same person and both skills need to be showcased.
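To show what a self-checking test looks like in spirit, here is a small sketch of constrained-random stimulus checked against a scoreboard. The FifoModel class is only a Python stand-in for a design under test. In a real flow the step call would drive a simulator, and all names here are my own.

```python
import random
from collections import deque

class FifoModel:
    """Cycle-based stand-in for a synchronous FIFO DUT with wrap-around pointers."""
    def __init__(self, depth):
        self.depth = depth
        self.mem = [0] * depth
        self.wr_ptr = self.rd_ptr = self.count = 0

    def step(self, push, pop, data_in):
        """Advance one clock. Push is ignored when full, pop is ignored when empty."""
        do_pop = pop and self.count > 0
        do_push = push and self.count < self.depth
        data_out = None
        if do_pop:
            data_out = self.mem[self.rd_ptr]
            self.rd_ptr = (self.rd_ptr + 1) % self.depth
        if do_push:
            self.mem[self.wr_ptr] = data_in
            self.wr_ptr = (self.wr_ptr + 1) % self.depth
        self.count += int(do_push) - int(do_pop)
        return data_out

def run_random_test(depth=4, cycles=1000, seed=1):
    """Drive random push/pop traffic and compare every popped word with the scoreboard."""
    rng = random.Random(seed)  # fixed seed so a failure can be reproduced
    dut, scoreboard, checked = FifoModel(depth), deque(), 0
    for _ in range(cycles):
        push, pop, data = rng.random() < 0.6, rng.random() < 0.5, rng.randrange(256)
        # Expected behaviour is decided from the occupancy before this clock edge.
        will_push = push and len(scoreboard) < depth
        will_pop = pop and len(scoreboard) > 0
        expected = scoreboard.popleft() if will_pop else None
        if will_push:
            scoreboard.append(data)
        assert dut.step(push, pop, data) == expected, "DUT and scoreboard disagree"
        checked += int(will_pop)
    return checked

print(run_random_test(), "words checked")
```

The fixed seed is a deliberate choice: when a random test fails, the first thing you want is to replay the exact same stimulus while looking at waveforms.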

Networking Fundamentals:

  1. The actual projects implemented on the job often revolve around networking.
  2. Depending on the exchange, the incoming market data comes through a TCP channel or a UDP multicast feed. Any TCP connection needs to be maintained by the trading machine, which could be another area of potential acceleration.
  3. Similarly there could be a requirement to filter some data and send it out to a different place (often another exchange) by packing it into packets. TCP or UDP might be used for this part of the communication, which again is a good avenue for acceleration.
  4. Having basic knowledge of the Ethernet protocol and its details will go a long way in keeping you ahead in the race.
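A good way to get comfortable with these protocols is to pick a frame apart by hand. The sketch below parses an untagged Ethernet II frame carrying IPv4 and UDP. The sample frame, its addresses and its ports are made up for illustration, and checksums are left at zero and not validated.

```python
import struct

ETH_HEADER_LEN = 14
ETHERTYPE_IPV4 = 0x0800
IP_PROTO_UDP = 17

def parse_udp_frame(frame):
    """Parse an untagged Ethernet II / IPv4 / UDP frame into a dict of fields."""
    dst_mac, src_mac, ethertype = struct.unpack_from("!6s6sH", frame, 0)
    if ethertype != ETHERTYPE_IPV4:
        raise ValueError("not an IPv4 frame")
    version_ihl = frame[ETH_HEADER_LEN]
    protocol = frame[ETH_HEADER_LEN + 9]
    if version_ihl >> 4 != 4 or protocol != IP_PROTO_UDP:
        raise ValueError("not an IPv4/UDP packet")
    ip_header_len = (version_ihl & 0x0F) * 4  # IHL is given in 32-bit words
    src_ip, dst_ip = struct.unpack_from("!4s4s", frame, ETH_HEADER_LEN + 12)
    udp_offset = ETH_HEADER_LEN + ip_header_len
    src_port, dst_port, udp_len, _checksum = struct.unpack_from("!HHHH", frame, udp_offset)
    return {
        "dst_mac": ":".join(f"{b:02x}" for b in dst_mac),
        "src_mac": ":".join(f"{b:02x}" for b in src_mac),
        "src_ip": ".".join(str(b) for b in src_ip),
        "dst_ip": ".".join(str(b) for b in dst_ip),
        "src_port": src_port,
        "dst_port": dst_port,
        "payload": frame[udp_offset + 8 : udp_offset + udp_len],  # udp_len includes the 8-byte header
    }

def build_sample_frame(payload):
    """Build a made-up multicast UDP frame (checksums zeroed) to feed the parser."""
    eth = bytes.fromhex("01005e010101") + bytes.fromhex("020000000001") + struct.pack("!H", ETHERTYPE_IPV4)
    ip = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 20 + 8 + len(payload), 0, 0, 64,
                     IP_PROTO_UDP, 0, bytes([10, 0, 0, 1]), bytes([239, 1, 1, 1]))
    udp = struct.pack("!HHHH", 30001, 30002, 8 + len(payload), 0)
    return eth + ip + udp + payload

print(parse_udp_frame(build_sample_frame(b"ORDER")))
```

The fixed offsets used here are exactly the kind of thing an RTL parser exploits: for an untagged frame with a 20-byte IP header, the UDP payload always starts at byte 42.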

Ethernet, PCIe and Network Architecture (Optional, Advanced):

  1. Many bigger firms with deeper pockets take up the challenge of implementing the Ethernet MAC layer themselves, and the PCIe stack too. Most others directly use the proven IP that comes with most of the boards mentioned above. The skills below will only be required at a handful of big firms.
  2. For Ethernet, there's an entire world of knowledge to be acquired: an understanding of SERDES, PHY vs PCS layer functions, MAC layer functions and potential optimizations for specific purposes. It is very hard to get this knowledge and experience without having worked on an Ethernet implementation in an industry setting. But something like this paper can give you some beginner insights.
  3. PCIe falls into the same category. It's an extensive protocol with lots of details and documentation. A beginner is almost never expected to know these unless they have directly worked on it. However, having some big-picture idea is always good.
  4. Similarly, the FPGA isn’t the only machine on the network. There are many other things like switches, NICs and other trading machines that could interoperate while making a decision. Having a sound knowledge of the network architecture and of the areas one might tweak to get that tiny bit of latency advantage can be a major plus point. Obviously, none of this is expected of someone who has never worked on HFT or networks before, and you will not be at a disadvantage if you don't know these things.
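To get a feel for the time scales involved at the Ethernet level, it helps to work out how long a frame physically occupies the wire. This sketch counts the 8 bytes of preamble and start-of-frame delimiter plus the frame itself, and ignores PCS encoding, FEC and SERDES latency, all of which add to the real figure. The function name is my own.

```python
PREAMBLE_AND_SFD_BYTES = 8  # 7-byte preamble + 1-byte start-of-frame delimiter

def wire_time_ns(frame_bytes, link_gbps):
    """Time from the first preamble bit to the last frame bit at the given data rate."""
    total_bits = (PREAMBLE_AND_SFD_BYTES + frame_bytes) * 8
    return total_bits / link_gbps  # bits / (Gbit/s) gives nanoseconds

# A minimum-size 64-byte frame on 10G and 25G links.
print(wire_time_ns(64, 10))
print(wire_time_ns(64, 25))
```

Even a minimum-size frame takes tens of nanoseconds just to arrive on a 10G link, which is why some designs start acting on the early bytes of a packet before the rest has been received.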

Implementation Skills:

Static Timing analysis (STA)

  1. For the uninitiated, static timing analysis is a way of checking whether your logic will meet the timing constraints of the FPGA and function reliably. As mentioned above, this article will help you with resources and basics on this topic.
  2. This is extremely important. Often, due to the extreme sensitivity to latency, RTL for HFT applications is written with very little pipelining, doing a lot of work in fewer clock cycles. This causes the implementation tool to place the LUTs very close to each other in order to improve the slack.
  3. Once the RTL coding is done and verified, it is common for the implementation tools to fail to meet timing on the FPGA for your design because of this congestion. Good knowledge of STA is required to play around with the logic and restructure it to try and meet timing.
  4. As a part of this, you should ideally also be able to write timing constraints for the implementation. Start from the basics: let's say you have two clocks in your design, what constraints would you need to add for them in the .xdc file? What about IOs? ...
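The core of a setup check is a single subtraction, and it is worth being able to do it on paper. The sketch below is a simplified model that assumes the launch and capture flops share one clock with zero skew. All delay values are hypothetical and do not come from any real device.

```python
def setup_slack_ns(clock_mhz, clk_to_q_ns, logic_ns, routing_ns, setup_ns, uncertainty_ns=0.0):
    """Setup slack for a register-to-register path on a single clock with zero skew."""
    period_ns = 1000.0 / clock_mhz
    data_arrival = clk_to_q_ns + logic_ns + routing_ns      # when data reaches the capture flop
    data_required = period_ns - setup_ns - uncertainty_ns   # latest allowed arrival time
    return data_required - data_arrival                     # negative means a timing violation

# Hypothetical path at 250 MHz: it passes with short routing and fails with long routing.
print(setup_slack_ns(250.0, clk_to_q_ns=0.3, logic_ns=2.1, routing_ns=1.2, setup_ns=0.1))
print(setup_slack_ns(250.0, clk_to_q_ns=0.3, logic_ns=2.1, routing_ns=1.8, setup_ns=0.1))
```

Notice that the logic delay is identical in both cases. Only the routing changed, which is typical of what you see in a congested low-latency design.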

Clock Domain Crossing (CDC)

  1. This is a fundamental skill any RTL designer needs, HFT or not. Almost every interviewer will ask about the different CDC techniques you know once they are done with STA. They might even ask you why one way is better than the other, or ask you to code a few.
  2. Go read this article on CDC. It has everything on the topic you’ll ever need.
  3. Again, constraints. Let's say you performed a CDC using a double-flop synchronizer. What constraints would you add to your .xdc file for this element? What about an AFIFO?
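One CDC idea that comes up with asynchronous FIFOs is passing the read and write pointers across domains in Gray code, so that only one bit changes per increment and a synchronizer can never sample a wildly wrong value. Here is a small sketch of the conversion in both directions, with a check of the one-bit-change property. The helper names are mine.

```python
def bin_to_gray(value):
    """Convert a binary count to Gray code."""
    return value ^ (value >> 1)

def gray_to_bin(gray):
    """Convert Gray code back to binary by XOR-ing all right shifts together."""
    value = 0
    while gray:
        value ^= gray
        gray >>= 1
    return value

def one_bit_changes(width):
    """True if consecutive Gray pointers, including the wrap-around, differ in exactly one bit."""
    size = 1 << width
    return all(
        bin(bin_to_gray(i) ^ bin_to_gray((i + 1) % size)).count("1") == 1
        for i in range(size)
    )

print([format(bin_to_gray(i), "03b") for i in range(8)])
print(one_bit_changes(4))
```

The wrap-around case only works out because the pointer range is a power of two, which is one reason asynchronous FIFO depths are usually chosen that way.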

Area Optimization

  1. This is closely tied to the timing optimisation challenge already mentioned.
  2. Designs for HFT applications are usually not very large and don’t fill the major part of the FPGA, but the logic elements are placed so close to each other that the congestion causes all sorts of problems. The only way to solve this is to have a good idea of exactly what is being inferred from your RTL and how you can modify it so that the congestion reduces.
  3. Often, a good knowledge of the architecture of the FPGA and its fundamental elements (LUTs, BRAMs, DSPs etc) guides you in these decisions.
  4. Additionally, instead of just writing RTL and simulating it, taking it all the way through the implementation flow (synthesis, place and route etc) will give you a lot more insight into what ends up happening with your RTL by the time it reaches the silicon. The next step is to try and control this 'inference' in a way that will reduce resource utilization and area congestion on the device.

Hardware Testing

Networking Knowledge

Go read this free book end to end and try to practice everything on a linux based computer. You will have a great advantage with this knowledge.

Ethernet Hands-On Knowledge

  1. At the risk of being too repetitive, Ethernet knowledge is the bread and butter of this job. You will often be expected to capture packets over the network, open them up using a tool like Wireshark, and interpret what’s in them to debug the behaviour of your system.
  2. There are many Linux commands that you will be using in this process, and you can practice those too. You can use these resources: link1, link2, link3
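It also helps to know what is inside a capture file, since you will sometimes want to generate one from a simulation or post-process one with a script. The sketch below writes and reads the classic pcap format (24-byte global header, 16-byte record headers) entirely in memory. It assumes little-endian files with microsecond timestamps, and the frame contents are dummy bytes.

```python
import io
import struct

PCAP_MAGIC = 0xA1B2C3D4       # classic pcap, microsecond timestamps
LINKTYPE_ETHERNET = 1
GLOBAL_HEADER = "<IHHiIII"    # magic, major, minor, thiszone, sigfigs, snaplen, linktype
RECORD_HEADER = "<IIII"       # ts_sec, ts_usec, captured length, original length

def write_pcap(stream, frames):
    """Write (ts_sec, ts_usec, frame_bytes) tuples as a little-endian pcap file."""
    stream.write(struct.pack(GLOBAL_HEADER, PCAP_MAGIC, 2, 4, 0, 0, 65535, LINKTYPE_ETHERNET))
    for ts_sec, ts_usec, data in frames:
        stream.write(struct.pack(RECORD_HEADER, ts_sec, ts_usec, len(data), len(data)))
        stream.write(data)

def read_pcap(stream):
    """Read a little-endian pcap file and return a list of (ts_sec, ts_usec, frame_bytes)."""
    header = stream.read(struct.calcsize(GLOBAL_HEADER))
    magic = struct.unpack(GLOBAL_HEADER, header)[0]
    if magic != PCAP_MAGIC:
        raise ValueError("not a little-endian microsecond pcap file")
    packets = []
    while True:
        record = stream.read(struct.calcsize(RECORD_HEADER))
        if len(record) < struct.calcsize(RECORD_HEADER):
            break  # end of file
        ts_sec, ts_usec, incl_len, _orig_len = struct.unpack(RECORD_HEADER, record)
        packets.append((ts_sec, ts_usec, stream.read(incl_len)))
    return packets

buf = io.BytesIO()
write_pcap(buf, [(1, 500, bytes(60)), (2, 0, b"abc")])
buf.seek(0)
print([(s, us, len(d)) for s, us, d in read_pcap(buf)])
```

Writing the bytes to a real file instead of a BytesIO should give you something Wireshark can open, which is a handy way to inspect frames produced by a testbench.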

Debugging on Hardware

Often things will work in simulation but not in hardware, and you’ll need to find out why. This is another special skill that distinguishes a good engineer from a mediocre one. Having projects on your resume that clearly show you have put your designs on actual hardware and debugged issues there will go a long way in showing your comfort with this environment. Some of this has to be enabled at the RTL design stage itself: you are expected to put debugging logic into your design up front so that you have some level of observability into the state of the hardware. This is another very common question in RTL design interviews.

If you don't own an FPGA board where you can go through this flow, we've got you covered with chiprentals, a service we started to provide free access to real FPGA environments for students and practitioners to work on. Go check it out.

C++ and Drivers

  1. In an embedded system of any kind, which is what the trading machine with an FPGA tied to it is, one needs to write drivers that talk to hardware. As the FPGA developer, you will be expected to do this at least to the extent needed to test your design on real hardware and to serve as a proof of concept for the actual software guys to build upon.
  2. This skill mainly involves being comfortable with C and C++. Pick up any hardware development board you can, maybe even an Arduino, and get used to writing code that controls the underlying hardware.
  3. The preferred project would be something on a linux based environment. Our service Chiprentals, provides an effortless way for you to access an environment like the one you would find in a real deployment.
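Most of this driver-style code boils down to reading and writing 32-bit registers in a memory-mapped window. The sketch below shows the pattern using an anonymous memory mapping as a stand-in for the device. On a real Linux system the mapping would come from the device instead (for example a PCIe BAR exposed by a driver), and the register offsets and bit names here are entirely hypothetical. On the job you would write the same thing in C or C++.

```python
import mmap
import struct

# Hypothetical register map, for illustration only.
CTRL_OFFSET = 0x00
STATUS_OFFSET = 0x04
CTRL_ENABLE = 1 << 0
CTRL_LOOPBACK = 1 << 3

class RegisterWindow:
    """32-bit little-endian register access on top of any writable buffer."""
    def __init__(self, buf):
        self.buf = buf

    def read32(self, offset):
        return struct.unpack_from("<I", self.buf, offset)[0]

    def write32(self, offset, value):
        struct.pack_into("<I", self.buf, offset, value & 0xFFFFFFFF)

    def set_bits(self, offset, mask):
        # Read-modify-write so the other bits in the register are preserved.
        self.write32(offset, self.read32(offset) | mask)

    def clear_bits(self, offset, mask):
        self.write32(offset, self.read32(offset) & ~mask)

# Anonymous 4 KiB mapping standing in for the device's register space.
regs = RegisterWindow(mmap.mmap(-1, 4096))
regs.set_bits(CTRL_OFFSET, CTRL_ENABLE | CTRL_LOOPBACK)
regs.clear_bits(CTRL_OFFSET, CTRL_LOOPBACK)
print(hex(regs.read32(CTRL_OFFSET)))
```

The read-modify-write helpers are the part worth internalizing, because overwriting a whole control register when you meant to flip one bit is a classic bring-up bug.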

Knowledge of Finance and Trading

This is not required and you will not be expected to know this or learn it.

DSA / Computer Science Skills

Often, while writing code to process data or test your hardware, you will be writing large pieces of software. Well-written code can make a large difference in the amount of time you take to get it working and finish the task at hand. Thanks to the great pay offered by these firms, they also demand the best work. In that spirit, I would not be surprised if the interviewer asks you a few problems related to data structures and algorithms.

Puzzles

They often spend time on puzzles to test your IQ. Most of the time these don’t carry much weightage in the interview but that’s subjective.

Resources:

The interview prep guide has your back on the basics of RTL and FPGA design mentioned in this article. The book mentioned earlier, Trading at the Speed of Light, is a really fun read that gives a lot of insight into this world even if you don't end up working there.

As for topics around networking and implementations of a realistic HFT system, we are working on a course that delves into the depths of the topics listed here. It will include an end-to-end implementation of a dummy HFT system from scratch. If you think this interests you, please fill out this form to let us know. Your feedback will help us make the right choices and get the right content out faster.

Batman

I'm Batman, a silent nerd and a watchful engineer obsessed with great Technology. Get in touch via the Discord community for this site
