# Echo-DSRN-114M-v0.1.2 on a Finance Dataset

Flower App `@mrs83/echo-dsrn-114m-finance`, created with `flwr new`.
This app performs federated instruction tuning of a pretrained Echo-DSRN-114M model on a finance dataset. Flower Datasets is used to download, partition, and preprocess the data, and Flower's Simulation Engine simulates the LLM fine-tuning process in a federated way, allowing the whole experiment to run on a single GPU.
Base model: `ethicalabs/Echo-DSRN-114M-v0.1.2`
## Architecture Details
| Property | Value |
|---|---|
| Model Type | echo_dsrn |
| Layers | 8 |
| Hidden Dim | 512 |
| Attention Heads | 4 |
| MLP Ratio | 8.0 |
| Vocab Size | 32011 |
| Hybrid Attention | True |
| RMSNorm | True |
## Parameter Breakdown
| Component | Parameters | % of Total |
|---|---|---|
| Total | 114.69M (114,687,488) | 100% |
| Embeddings | 16.39M | 14.29% |
| DSRN Blocks (Aggregate) | 81.91M | 71.42% |
| LM Head | 16.39M | 14.29% |
## Internal Block Structure (Per Layer)
| Sub-Component | Parameters | Description |
|---|---|---|
| MLP (Feed-Forward) | 4.20M | Upscaled hidden layers |
| DSRN Slow State | 3.15M | Constant-time memory gates |
| GRU Fast State | 1.58M | Recurrent fast path |
| Surprise Gating | 264,192 | Dynamic focus mechanism |
| Normalization | 1,024 | LayerNorm / RMSNorm |
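The parameter breakdown is internally consistent with the architecture table: the embedding matrix and the LM head each hold vocab × hidden = 32011 × 512 parameters (they are counted separately, so the weights appear untied), and the DSRN blocks account for the remainder. A quick sanity check:

```python
# Sanity-check the parameter breakdown against the architecture table.
vocab_size = 32011
hidden_dim = 512
total = 114_687_488

embeddings = vocab_size * hidden_dim   # input embedding matrix
lm_head = vocab_size * hidden_dim      # untied output projection
blocks = total - embeddings - lm_head  # DSRN blocks (aggregate)

print(f"{embeddings:,}")               # 16,389,632 ≈ 16.39M
print(f"{blocks:,}")                   # 81,908,224 ≈ 81.91M
print(round(100 * blocks / total, 2))  # 71.42 (% of total)
```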
## Environment setup

Project dependencies are defined in `pyproject.toml`. Install them in an activated Python environment with:
```shell
pip install -e .
```
To run this on AMD ROCm, install with:
```shell
pip install -e ".[rocm]" --extra-index-url https://download.pytorch.org/whl/rocm7.1/
```
## Experimental setup
The dataset is divided into 10 partitions in an IID fashion, and one partition is assigned to each ClientApp. In each of the 50 rounds, we randomly sample a fraction (0.4) of the total nodes to participate. All Flower App settings are defined in `pyproject.toml`.
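Concretely, these settings typically appear in `pyproject.toml` roughly as follows (the key names below are illustrative, not copied from this repo; check the `[tool.flwr.app.config]` entry for the exact ones). Note that sampling a 0.4 fraction of 10 nodes yields 4 clients per round, which matches the `received 4 results` lines in the experiment logs.

```toml
[tool.flwr.app.config]
num-server-rounds = 50   # total federated rounds
fraction-fit = 0.4       # 0.4 × 10 nodes → 4 clients sampled per round
```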
This app is designed to run with 10 virtual SuperNodes with GPU-enabled ClientApp execution. First, change the configuration of the Simulation Runtime (which by default uses 10 nodes and CPU only). This guide assumes your default SuperLink connection points to one that is ready for simulations; if you aren't sure, refer to the "How to run Flower locally" guide.
```shell
flwr federation simulation-config \
    --num-supernodes=10 \
    --client-resources-num-cpus=6 \
    --client-resources-num-gpus=1.0
```
## Running the experiment
Run the challenge with the default config values. The configs are defined in the `[tool.flwr.app.config]` entry of `pyproject.toml` and are loaded automatically:
```shell
flwr run --stream
```
## Experiment results
```
INFO :      aggregate_fit: received 4 results and 0 failures
INFO :      Communication budget: used 55721.44 MB (+557.21 MB this round) / 200,000 MB
Loading weights: 100%|██████████| 139/139 [00:00<00:00, 3699.17it/s, Materializing param=model.final_norm.weight]
INFO :      fit progress: (50, 0.0, {}, 894.1292458820001)
INFO :      configure_evaluate: no clients selected, skipping evaluation
INFO :
INFO :      [SUMMARY]
INFO :      Run finished 50 round(s) in 894.13s
INFO :      History (loss, centralized):
INFO :          round 0: 0.0
INFO :          round 1: 0.0
INFO :          round 2: 0.0
INFO :          round 3: 0.0
INFO :          round 4: 0.0
INFO :          round 5: 0.0
INFO :          round 6: 0.0
INFO :          round 7: 0.0
INFO :          round 8: 0.0
INFO :          round 9: 0.0
INFO :          round 10: 0.0
INFO :          round 11: 0.0
INFO :          round 12: 0.0
INFO :          round 13: 0.0
INFO :          round 14: 0.0
INFO :          round 15: 0.0
INFO :          round 16: 0.0
INFO :          round 17: 0.0
INFO :          round 18: 0.0
INFO :          round 19: 0.0
INFO :          round 20: 0.0
INFO :          round 21: 0.0
INFO :          round 22: 0.0
INFO :          round 23: 0.0
INFO :          round 24: 0.0
INFO :          round 25: 0.0
INFO :          round 26: 0.0
INFO :          round 27: 0.0
INFO :          round 28: 0.0
INFO :          round 29: 0.0
INFO :          round 30: 0.0
INFO :          round 31: 0.0
INFO :          round 32: 0.0
INFO :          round 33: 0.0
INFO :          round 34: 0.0
INFO :          round 35: 0.0
INFO :          round 36: 0.0
INFO :          round 37: 0.0
INFO :          round 38: 0.0
INFO :          round 39: 0.0
INFO :          round 40: 0.0
INFO :          round 41: 0.0
INFO :          round 42: 0.0
INFO :          round 43: 0.0
INFO :          round 44: 0.0
INFO :          round 45: 0.0
INFO :          round 46: 0.0
INFO :          round 47: 0.0
INFO :          round 48: 0.0
INFO :          round 49: 0.0
INFO :          round 50: 0.0
INFO :      History (metrics, distributed, fit):
INFO :      {'entropy': [(1, 3.1537166031646793),
INFO :                   (2, 3.09459967930061),
INFO :                   (3, 3.047732968983164),
INFO :                   (4, 3.0271469014314363),
INFO :                   (5, 3.0261276602745055),
INFO :                   (6, 2.9603950560092924),
INFO :                   (7, 3.0176863543174433),
INFO :                   (8, 2.9284382580965516),
INFO :                   (9, 2.8367491270820446),
INFO :                   (10, 2.8764218747615815),
INFO :                   (11, 2.835272414447733),
INFO :                   (12, 2.7497260510921477),
INFO :                   (13, 2.7568057596683504),
INFO :                   (14, 2.75446877588557),
INFO :                   (15, 2.7656791712174202),
INFO :                   (16, 2.775463658571243),
INFO :                   (17, 2.690022764738512),
INFO :                   (18, 2.7505733907222747),
INFO :                   (19, 2.734186365134769),
INFO :                   (20, 2.7411312047861234),
INFO :                   (21, 2.7991202581250616),
INFO :                   (22, 2.722627188692867),
INFO :                   (23, 2.7549223840236663),
INFO :                   (24, 2.724032837152481),
INFO :                   (25, 2.6940028965473175),
INFO :                   (26, 2.6525548524883584),
INFO :                   (27, 2.689897245168686),
INFO :                   (28, 2.694835585355759),
INFO :                   (29, 2.6846344113349914),
INFO :                   (30, 2.690022542291953),
INFO :                   (31, 2.676495109298825),
INFO :                   (32, 2.6272316581086708),
INFO :                   (33, 2.6007019428871465),
INFO :                   (34, 2.6372929824582956),
INFO :                   (35, 2.606489239966782),
INFO :                   (36, 2.640555214881897),
INFO :                   (37, 2.6189064415901235),
INFO :                   (38, 2.611790724627349),
INFO :                   (39, 2.6379489508022678),
INFO :                   (40, 2.6657013259158706),
INFO :                   (41, 2.590552067445324),
INFO :                   (42, 2.648813943949867),
INFO :                   (43, 2.6547619861084284),
INFO :                   (44, 2.656480408268403),
INFO :                   (45, 2.6205099165189227),
INFO :                   (46, 2.6725728034973146),
INFO :                   (47, 2.5895439392835042),
INFO :                   (48, 2.6552754379498102),
INFO :                   (49, 2.6723885743728024),
INFO :                   (50, 2.629795551300049)],
INFO :      'mean_token_accuracy': [(1, 0.5562231012303486),
INFO :                   (2, 0.7136742029649992),
INFO :                   (3, 0.7282216949914201),
INFO :                   (4, 0.7746701105218957),
INFO :                   (5, 0.7815696507692337),
INFO :                   (6, 0.7987740755081176),
INFO :                   (7, 0.8047565344796045),
INFO :                   (8, 0.8436755715261566),
INFO :                   (9, 0.8680903527660889),
INFO :                   (10, 0.833289985358715),
INFO :                   (11, 0.8851204098961533),
INFO :                   (12, 0.9262883424758911),
INFO :                   (13, 0.8880675122141838),
INFO :                   (14, 0.926723267136209),
INFO :                   (15, 0.9040267877074244),
INFO :                   (16, 0.8979744702577591),
INFO :                   (17, 0.9160661123653686),
INFO :                   (18, 0.9485134035348892),
INFO :                   (19, 0.9427994866912913),
INFO :                   (20, 0.9417864364076411),
INFO :                   (21, 0.9263914010990113),
INFO :                   (22, 0.9530252054121637),
INFO :                   (23, 0.9712428271770477),
INFO :                   (24, 0.9776062667369843),
INFO :                   (25, 0.9710160821676255),
INFO :                   (26, 0.9715865255180481),
INFO :                   (27, 0.9701469495892525),
INFO :                   (28, 0.9614542603492737),
INFO :                   (29, 0.9717181101441383),
INFO :                   (30, 0.9762815252714581),
INFO :                   (31, 0.9760966698498164),
INFO :                   (32, 0.9795183718445176),
INFO :                   (33, 0.9787375739204189),
INFO :                   (34, 0.9818345720144386),
INFO :                   (35, 0.98637633643014),
INFO :                   (36, 0.9783522680401802),
INFO :                   (37, 0.9860417125660539),
INFO :                   (38, 0.9915313018865177),
INFO :                   (39, 0.9735496829376229),
INFO :                   (40, 0.9869524314500271),
INFO :                   (41, 0.9859556200563774),
INFO :                   (42, 0.9860346473165849),
INFO :                   (43, 0.9878364399981446),
INFO :                   (44, 0.9897713079300811),
INFO :                   (45, 0.9881182716415218),
INFO :                   (46, 0.9912083655595779),
INFO :                   (47, 0.9890078300755695),
INFO :                   (48, 0.9878364399981446),
INFO :                   (49, 0.9879095382441833),
INFO :                   (50, 0.9833552643656731)],
INFO :      'train_loss': [(1, 2.118514348536157),
INFO :                   (2, 0.823670981065632),
INFO :                   (3, 0.7135289078286301),
INFO :                   (4, 0.5503281623191066),
INFO :                   (5, 0.5750757485628128),
INFO :                   (6, 0.5377023687586189),
INFO :                   (7, 0.4579469152870647),
INFO :                   (8, 0.39061935628235134),
INFO :                   (9, 0.3411383502207817),
INFO :                   (10, 0.456904406961985),
INFO :                   (11, 0.3308007910765901),
INFO :                   (12, 0.22608101771911604),
INFO :                   (13, 0.4269442563207122),
INFO :                   (14, 0.21218086286555835),
INFO :                   (15, 0.36772751153448463),
INFO :                   (16, 0.3447273552523257),
INFO :                   (17, 0.29535657221225814),
INFO :                   (18, 0.256374451245938),
INFO :                   (19, 0.2782393436079909),
INFO :                   (20, 0.23193878578749563),
INFO :                   (21, 0.27579879837046783),
INFO :                   (22, 0.1978169633950282),
INFO :                   (23, 0.12385466246051921),
INFO :                   (24, 0.0952083176030669),
INFO :                   (25, 0.17468968942767446),
INFO :                   (26, 0.1708230679997079),
INFO :                   (27, 0.09796904004698029),
INFO :                   (28, 0.2111228520960781),
INFO :                   (29, 0.14141523211168502),
INFO :                   (30, 0.15123780636023687),
INFO :                   (31, 0.11536264889033597),
INFO :                   (32, 0.138783653717287),
INFO :                   (33, 0.09142257179886625),
INFO :                   (34, 0.10622006885087215),
INFO :                   (35, 0.06972711510215729),
INFO :                   (36, 0.10574966628896824),
INFO :                   (37, 0.06575150841504608),
INFO :                   (38, 0.04898714312978455),
INFO :                   (39, 0.16180810307311808),
INFO :                   (40, 0.06623421963227294),
INFO :                   (41, 0.07207394609609163),
INFO :                   (42, 0.06387270625194562),
INFO :                   (43, 0.07361841263636124),
INFO :                   (44, 0.05762298128963736),
INFO :                   (45, 0.044250426808322925),
INFO :                   (46, 0.05656750529568705),
INFO :                   (47, 0.04254968277835036),
INFO :                   (48, 0.06769642132567098),
INFO :                   (49, 0.05810665879223432),
INFO :                   (50, 0.09253640343820142)]}
```
## Model saving
The global PEFT model checkpoints and evaluation results are saved after aggregation on the server side. By default, the output directory is `~/.ethicalabs/flwr/results`. You can configure this path using the `results-dir` key under the `[tool.flwr.app.config]` entry in `pyproject.toml`. The checkpoint saving frequency is specified with `train.save-every-round` in the same config section.
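A rough sketch of the checkpointing rule implied by `train.save-every-round` (a hypothetical helper, not the app's actual server code):

```python
def should_save(server_round: int, save_every: int, total_rounds: int) -> bool:
    """Save a PEFT checkpoint every `save_every` rounds and at the final round."""
    return server_round % save_every == 0 or server_round == total_rounds

# Example: with save-every-round = 10 over 50 rounds, checkpoints such as
# peft_10, peft_20, ... would be written under results-dir.
saved = [r for r in range(1, 51) if should_save(r, 10, 50)]
print(saved)  # [10, 20, 30, 40, 50]
```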
## Evaluation
To evaluate the best federated checkpoint (e.g., round 20) across the financial benchmarks (FPB, FIQA, TFNS), use the integrated evaluation script:
```shell
python echo_finance/eval.py \
    --base-model-name-path ethicalabs/Echo-DSRN-114M-v0.1.2 \
    --peft-path ~/.ethicalabs/flwr/results/[RUN_TIMESTAMP]/peft_20 \
    --datasets fpb,fiqa,tfns \
    --apply-chat-template
```
> **Note:** The `--apply-chat-template` flag ensures the model is evaluated using the same ChatML format it was aligned on.
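For illustration, the ChatML structure the flag refers to looks roughly like this. This is a hand-rolled sketch; in practice the tokenizer's chat template builds the prompt, and the system message here is invented:

```python
def to_chatml(system: str, user: str) -> str:
    """Build a ChatML-formatted prompt, leaving the assistant turn open
    so the model generates the answer."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

prompt = to_chatml(
    "You are a financial sentiment classifier.",
    "Stocks rallied after earnings beat expectations.",
)
```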
## Benchmarks
| Dataset | Sentiment Accuracy |
|---|---|
| FPB (Financial PhraseBank) | 70.2% |
| TFNS (Twitter Financial News Sentiment) | 70.2% |
| FIQA (Financial QA) | 63.8% |
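The numbers above are exact-match rates over predicted sentiment labels. A hypothetical sketch of the metric (the actual implementation lives in `echo_finance/eval.py` and may differ):

```python
def sentiment_accuracy(predictions: list[str], references: list[str]) -> float:
    """Fraction of examples whose predicted label matches the reference,
    ignoring case and surrounding whitespace."""
    correct = sum(
        p.strip().lower() == r.strip().lower()
        for p, r in zip(predictions, references)
    )
    return correct / len(references)

# 2 of 3 labels match, so the score is 2/3.
score = sentiment_accuracy(
    ["positive", "negative", "neutral"],
    ["positive", "neutral", "neutral"],
)
```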
Flower App by ethicalabs.ai - AI/ML research and development - HuggingFace