Kafka - System Design Ultimatum

Diagram: Kafka · 6 elements

Kafka

Can be used as
- FIFO queue (ordering is guaranteed only within a partition)
- Pub/Sub messaging system
- a real-time streaming platform
- Database

Streaming Platform: Kafka stores data as a continuous stream of records, which can be processed in different ways.

Commit Log: When you push data to Kafka, it appends the data to a stream of records, like appending lines to a log file or, if you come from a database background, like writing to the WAL (write-ahead log).
This stream of data can be "replayed", or read from any point in the log.

Partitions: A partition is analogous to a shard in a database and is the core concept behind Kafka's scaling capabilities.

Offset: Within a partition, the first record is at offset 0 and the last of n records is at offset n-1.
The performance of the system also depends on how you set up partitions; we will look into that later in the article.

Producer

No key specified => When the message has no key, the producer picks the partition itself and tries to balance the number of messages across all partitions (round-robin in older clients; newer clients stick to one partition per batch and then rotate).

Key specified => When the message has a key, the producer hashes the key and takes the result modulo the number of partitions, so the same key always lands on the same partition. This is plain hash-modulo rather than consistent hashing, so changing the partition count changes where keys map.

Custom partitioning logic => We can write our own rules that decide the partition.

You can send messages to Kafka in 3 ways.

Fire and forget
Synchronous send
Asynchronous send
Each of them trades off performance against delivery guarantees.

You can also configure the acknowledgment behavior on the producer.

ACK 0: Don't wait for an ack (fastest)
ACK 1: Consider the message sent when the leader broker has received it (faster)
ACK All: Consider the message sent when all in-sync replicas have received it (fast)

You can compress and batch messages on the producer before sending them to the broker.

This gives higher throughput and lowers disk usage but raises CPU usage.

Avro Serializer/Deserializer

If you use Avro as the serializer/deserializer instead of plain JSON, you have to declare your schema upfront, but you get better performance and save storage.

Consumer

Poll loop

A Kafka consumer constantly polls the broker for data; the broker does not push data to the consumer.

You can configure the partition assignment strategy.

Range: Each consumer gets a consecutive range of partitions
Round Robin: Partitions are handed out to consumers one at a time, in turn
Sticky: Tries to minimize the impact of a rebalance by keeping most of the existing assignment as is
Cooperative sticky: Sticky, but allows cooperative rebalancing

Batch size

We can configure how many records and how much data are returned per poll call.

Commit offset

After reading messages, we can update the offset position for the consumer; this is called committing the offset. Auto commit can be enabled, or the application can commit the offset explicitly. An explicit commit can be done either synchronously or asynchronously.
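
To make the producer's partition choice concrete, here is a minimal sketch in plain Python. It assumes the common hash-modulo scheme for keyed messages and simple round-robin for keyless ones. The class name `SketchPartitioner` is hypothetical, and `zlib.crc32` is only a stand-in for the hash a real Kafka client uses, so the partition numbers it produces will not match a real cluster.

```python
import zlib
from itertools import count


class SketchPartitioner:
    """Toy partition chooser for illustration; not the real Kafka client logic."""

    def __init__(self, num_partitions):
        self.num_partitions = num_partitions
        self._cursor = count()  # round-robin cursor for keyless messages

    def partition_for(self, key=None):
        if key is None:
            # No key: spread messages evenly by cycling through partitions.
            return next(self._cursor) % self.num_partitions
        # Key given: hash it and take the result modulo the partition count.
        # crc32 is an assumption made for this sketch only.
        return zlib.crc32(key) % self.num_partitions


partitioner = SketchPartitioner(num_partitions=3)

# The same key always maps to the same partition.
first = partitioner.partition_for(b"user-42")
second = partitioner.partition_for(b"user-42")

# Keyless messages cycle through the partitions in order.
keyless = [partitioner.partition_for() for _ in range(4)]
```

Because the mapping is a modulo over the partition count, adding partitions later can move existing keys to a different partition, which matters if you rely on per-key ordering.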
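
The commit log, offsets, the poll loop and offset commits fit together as in the rough sketch below. It is an in-memory simulation, not the Kafka client API: `PartitionLog`, `SketchConsumer` and `max_poll_records` are hypothetical names chosen for this example. The committed offset is treated as the offset of the next record to read, so a consumer that restarts before committing sees its last batch again.

```python
class PartitionLog:
    """Append-only list standing in for a single partition."""

    def __init__(self):
        self._records = []

    def append(self, value):
        # The new record's offset is its index in the log.
        self._records.append(value)
        return len(self._records) - 1

    def read(self, offset, max_records):
        return self._records[offset:offset + max_records]


class SketchConsumer:
    """Toy consumer that pulls batches and tracks its own offsets."""

    def __init__(self, log, max_poll_records=2):
        self.log = log
        self.max_poll_records = max_poll_records
        self.position = 0   # next offset to fetch
        self.committed = 0  # offset to resume from after a restart

    def poll(self):
        # The consumer pulls data; the log never pushes.
        batch = self.log.read(self.position, self.max_poll_records)
        self.position += len(batch)
        return batch

    def commit(self):
        self.committed = self.position

    def seek(self, offset):
        # Replay: move the read position to any earlier offset.
        self.position = offset


log = PartitionLog()
for value in ["a", "b", "c"]:
    log.append(value)

consumer = SketchConsumer(log, max_poll_records=2)
first_batch = consumer.poll()    # offsets 0 and 1
consumer.commit()                # committed offset is now 2
second_batch = consumer.poll()   # offset 2, not committed yet

# Simulate a restart before the second commit: resume from the committed offset.
consumer.seek(consumer.committed)
redelivered = consumer.poll()
```

Committing only after processing a batch gives at-least-once behavior in this sketch: nothing is skipped, but the uncommitted batch is read twice.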

DNS

Load Balancer