My review of granite-4.0-h-tiny

#2
by algorithm - opened

(this review was not written by AI)

Hi IBM,

Nice job on this Granite 4.0 release; I had been looking forward to these. So far I have only tried this particular one (h-tiny) and am currently downloading "small", but I wanted to share my thoughts so far. I'll try to keep it somewhat organized:

My use cases:

  • The field of medicine
  • General every-day topics

Positives:

It's good at going straight to the point when asked a question. Some other LLMs will first give a huge introduction to the topic, which is useless: if an overview were what I wanted (for example: "give me an overview of the topic"), I would have asked for that. This model is good at skipping all of that.

It has a decent amount of knowledge in the field of medicine for its size.

It's good at putting things in an organized list.

At the end of every answer it does, unfortunately, give the disclaimer "please consult a healthcare professional", but it keeps this brief. So even though it would be better to have no disclaimer at all, I appreciate that it doesn't produce the endless disclaimers some other models do. Those really irritate me, so I'm glad this model keeps it short.

Some room for improvement:

While it's good at going straight to the point, I do have to say that the answers could be a bit longer. By that I don't mean the introduction or disclaimers (which are nice and short); the overall answers themselves are just a bit too brief. To be fair, this may be better in the small model, but it's still downloading. For ideas on this, you may want to look into projects such as LongWriter by Zhipu (I'm not affiliated); here is their paper: https://arxiv.org/abs/2408.07055

This point may be related to my previous one, but I've noticed that if you ask Granite tiny a follow-up question such as "What about ..?", it will give reasons why that is also a valid option. For example (I just made this up): if you ask it "What are popular colors?" and it outputs "Red, Yellow, and Blue", and I follow up with "What about green?", it outputs "That's also a popular color". This makes me wonder: if it really is, why didn't it include that in the first answer? Again, this might be the same issue of answers being a bit too short, as I mentioned in my previous point.

My conclusion:

It's a nice model for my use cases, especially for its size. It could, however, benefit from giving longer answers.

Same experience here. I've been giving it a small selection of dumb tools to send commands to a robot. It runs excellently on a 24GB M4 MBA with plenty of headroom to spare. Though I would say that for my case, it would benefit from giving fewer, shorter answers. YMMV, but my personal use cases for tool-calling LLMs only require translating natural-language prompts into choosing tools and passing parameters into them.
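For context, the setup is roughly something like this (a simplified sketch; `move_robot` and its schema are made up for illustration, and it assumes a local OpenAI-compatible server such as llama.cpp's llama-server):

```python
# Simplified sketch of the tool-calling setup. The tool name ("move_robot")
# and its schema are hypothetical; assumes a local OpenAI-compatible server.
import json
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="none")

# One "dumb" tool: the model only has to pick it and fill in parameters.
tools = [{
    "type": "function",
    "function": {
        "name": "move_robot",
        "description": "Move the robot in a direction for a distance in cm.",
        "parameters": {
            "type": "object",
            "properties": {
                "direction": {
                    "type": "string",
                    "enum": ["forward", "backward", "left", "right"],
                },
                "distance_cm": {"type": "number"},
            },
            "required": ["direction", "distance_cm"],
        },
    },
}]

resp = client.chat.completions.create(
    model="granite-4.0-h-tiny",
    messages=[{"role": "user", "content": "Go forward about half a meter."}],
    tools=tools,
)

# The model's job ends at choosing a tool and its arguments; plain code
# dispatches the actual robot command from there.
for call in resp.choices[0].message.tool_calls or []:
    args = json.loads(call.function.arguments)
    print(call.function.name, args)
```

The point is that the model only has to pick a tool and fill in arguments correctly; everything downstream is ordinary code.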

IBM Granite org

Thanks for sharing your thoughts! We love getting feedback of any sort and will take it into account for future models.

@algorithm out of curiosity, were you using any system prompting in your experimentation? For the 3.x series, we found that the need for default system prompting was a major barrier to adoption, so for the 4.x series, the team opted to avoid required system prompts as much as possible. The model should be very amenable to system prompting to help tailor the level of detail and style of response. If you do experiment further, please let us know how it goes!
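For example, something along these lines (a rough sketch using the transformers chat template; the system prompt text below is just an illustration, not necessarily the one in this repo, and the model id assumes the usual ibm-granite naming):

```python
# Minimal sketch of adding a system prompt via the transformers chat
# template. The system text is illustrative, not the repo's default.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ibm-granite/granite-4.0-h-tiny"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [
    {"role": "system", "content": (
        "You are a helpful assistant. Give detailed, thorough answers, "
        "organized as lists where that helps readability."
    )},
    {"role": "user", "content": "What are common first-line treatments for hypertension?"},
]

# Build the prompt with the system turn included, then generate.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

out = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```

Tweaking that system text (e.g., explicitly asking for more detail) should be the easiest lever for the answer-length issue you mentioned.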

Thank you @anderbogia for sharing your experience too! That's actually a really cool use case (sending commands to a robot); I should read more about that type of thing.

@gabegoodhart no problem :) and thank you for letting me know about the system prompt. I remember you from several llama.cpp commits, going back at least a year now, I think? Time flies!
I actually did not use a system prompt. I see one was also added to this repo; I'll give that a shot and see if it improves things.
Fortunately, it was already a decent model, but I'm excited to see what happens when I add that!
