Linked lists in C and Java

Now that I finally have a bit of free time, I've been playing around with linked lists. They're one of the first and simplest data structures you learn, yet truly coming to terms with them is often students' first foray into dealing with pointers.

One thing that always bugged me was how we ought to handle linked lists differently in a language like C versus a memory-managed language like Java. In C, every node is another entity that must be malloc-ed and free-ed. In Java, you mostly don't need to worry about that.

In the basic questions you find online about linked lists, these aren't that big of a deal. I'm not going to delve into more complicated requirements, such as those for implementing heap memory itself. Instead, let's examine the more annoying parts of basic operations on linked lists, especially those asked in interview questions under time constraints.

These annoyances are due to a combination of the following needs:

  • operations that delete, insert (prepend and/or append) on singly-linked, doubly-linked, and/or circular linked lists
  • dealing with the head of the linked list and the case of an empty list
  • writing clean and elegant code

I've taken many of the following examples from Nick Parlante's Linked List Problems, which are part of Stanford's CS Library. The problems are all written using C.

Insertion in C vs. Java

One of the basic utility functions that Nick provides is a function called Push, whose signature is as follows:

void Push(struct node** headRef, int newData);

The purpose of this function is to take in the head of the linked list and add a new node in the front (the simplest way to add data to a linked list), which means we're making this new node the new head and connecting it to the old head, which now becomes the second node in the linked list.

What would this look like in Java? Or Python? It's a bit tricky to directly say what it ought to be based on the function signature in C - the reason is that you can't have a pointer to a pointer in either of these two memory-managed languages.

In C, having a reference parameter helps with a lot of functionality. Spot the problem in this definition of Push:

void Push(struct node* head, int newData) {
    struct node* newNode = malloc(sizeof(struct node));
    newNode->next = head;
    newNode->data = newData;
    head = newNode;
}

Ostensibly, Push is being called in some other function, such as main, in order to append to an existing linked list:

int main() {
    struct node* mainHead = CreateLinkedList123();
    Push(head, 0);
    // rest of function
}

But the head parameter in Push is not the same pointer as the local variable mainHead in the body of main. They point to the same place, namely, the first node of the linked list, but they are two different pointers. Once Push returns, head disappears from the stack, yet mainHead is completely unaffected, leaving us with an orphaned block of memory (the newNode we allocated).

To solve this, we have Push take in a reference parameter to the head pointer.

void Push(struct node** headRef, int newData) {
    struct node* newNode = malloc(sizeof(struct node));
    newNode->next = *headRef;
    newNode->data = newData;
    *headRef = newNode;
}

This is a common technique in C when you want to modify some value in another function. This now brings us to the question - how would you write something like Push in Python? Or Java? You can't manipulate pointers like you can in C, so you can't translate Push directly into Java unless you do something crazy like using AtomicReference. In cases like these, it's far better to adapt to the paradigm of the language itself, which, in Java's case, is object-oriented programming.

Instead of manipulating a naked Node data structure, we'd encapsulate it inside a LinkedList class. And we would never have a function called Push just floating around - it would be a method on LinkedList and it would probably look like the following:

class LinkedList {
    private Node head;
    void Push(int newData) {
        Node newNode = new Node(newData);
        newNode.setNext(head);
        head = newNode;
    }
}

And something similar would work in Python. Not that you can't do something similar in C, either - in fact, it's probably worthwhile to also interact with linked lists through a higher-level struct as well.