Hi everyone,
I’m a PhD student in Computer Science researching why people choose to self-host software—what motivates you, what concerns you, and what factors affect your decision-making.
To better understand this, I’ve prepared a short anonymous survey (~10 minutes). Your insights as part of the self-hosting community would be incredibly valuable for this research.
🔗 Survey link: https://survey.lpt.feri.um.si/376953?newtest=Y&lang=en&s=ls
This study is part of my doctoral research at the University of Maribor, Slovenia, conducted under the supervision of Assist. Prof. Lili Nemec Zlatolas, PhD. All responses are anonymous and used strictly for academic purposes.
If you’ve ever self-hosted anything—or even just considered it—I’d really appreciate your input.
Thanks a lot for your time, and feel free to ask me anything about the project (luka.hrgarek@um.si)!
Cheers!
I submitted a response, but if I may give some feedback, the second portion brings up:
I am willing to pay a substantial amount for hardware required for self-hosting.
This seemed out of place because there were no other value related questions (iirc). Such as:
- I believe self hosting saves me money in the short term
- I believe self hosting saves me money in the long run
I’m sure you could also think of more. But I think it’s pretty important because between cloud service providers and any non-free apps you want to use, it can be quite costly compared to the cost of some hardware and time it takes to set things up.
The rest of my responses don’t change but if you’re wanting to understand the impact of money in all of this, I think some more questions are needed.
Best of luck!
I believe self hosting saves me money in the short term
I believe self hosting saves me money in the long run
I can add to the voices here that have this as one big consideration. With some second-hand hardware, it’s very cheap to set up almost unlimited cloud space for personal use.
That, and a lot of questions about ease of use too, but I answered them neutral because some are bears to set up, others are one click. Idk it depends.
And unless you are expecting significant traffic, you can use an old Core2Quad with 2GB RAM and it will work just fine.
Not to mention that a lot of self-hosting can be done on hardware you already had laying around.
And I self-host precisely because of the money I save using surplussed hardware. I have a symmetrical 1Gb SOHO fibre connection from my ISP, so I can host whatever the hell I want, I just need to stand it up. And a beefy older system with oodles of RAM is perfect for spinning up VMs of various platforms for various tasks. This saves me craploads of money over even a single VM on cloud platforms like Vultr. Plus, even if I were to support a “heavy” service sufficiently in demand to warrant its own iron, it still costs me less than a year’s worth of hosting to obtain a decent platform for that service to run on all by its lonesome.
My only cloud costs end up being those services which are distributed for redundancy and geographical distance, such as DNS and caching CDNs.
My only quibble would be to swap “pay” for “invest” which captures both the dynamic of up front expense and expected savings from ending recurring subscription fees. That’s how I look at it. Every penny I put into my own digital sovereignty is an investment that will yield returns both financial and otherwise.
I saved money by stea- I mean borrowing - work equipment
Second this - so far it has cost me money, but as I am able to cancel more subscription services, the savings will add up.
People who influence my behavior think that I should use cloud services.
This question is going to get bad data. No one likes to think of themselves as being influenced. A more effective phrasing would be “…people I trust…”
Thanks for the comment — that’s a valid observation, and I understand how the wording might feel a bit awkward.
Just to clarify: the statement comes from a standardized construct called Subjective Norms, and follows the phrasing from the paper “A Theoretical Extension of the Technology Acceptance Model” by Venkatesh & Davis (2000).
For all independent variables in the survey, we relied on validated scales and established practices from prior scientific research, to ensure consistency and reliability. That said, I really appreciate your feedback. :)
I’m old enough to consider the framing of the question to be weirdly loaded.
It does not feel that long ago where people would be asked to justify entrusting their product’s functions and data to a bunch of strangers who can make unilateral decisions about your service with zero comeback. Now we’re being asked to justify not doing that.
Thank you for your comment. The use of similar statements is a common practice in survey research, as it helps to capture various dimensions of a construct more reliably and provides a clearer understanding of individual perspectives.
Regarding your concern, the purpose of this study is not to ask anyone to justify or defend their choices, whether it’s about using third-party services or self-hosting. Instead, we aim to identify the factors that influence such decisions, from a scientific standpoint, to better understand the motivations behind them. The goal is not to judge whether one choice is better than another, but to gain insights into the different considerations that shape people’s decisions when it comes to managing their data and services. Thank you again for taking the time to complete the survey.
Sure, I’m just bemoaning the fact that you’ve taken cloud hosting to be the default. It’s as much a complaint about the world in general as anything specific to you. Good luck with it all.
Totally understand your concern, and you’re right, the assumption of cloud as a default can be frustrating in many ways.
That said, this framing partly reflects the state of the academic literature: in the past 10–15 years, cloud adoption (especially SaaS) has been extensively studied, to the point where it often feels “default” in research too. In contrast, self-hosting has been far less explored, which is exactly why we’re doing this study—to help fill that gap and highlight its relevance, especially in academic contexts.
Thanks again for your thoughts and for the good wishes! :)
I use self-hosted services in the following categories as much as possible…
That question could really use a “not applicable” option. I don’t operate any home automation solutions, so any answer from me would be invalid, and neutral answers because the item is not relevant will appear the same as neutral answers because I use both self-hosted and externally hosted solutions (e.g. Mullvad for privacy and Tailscale to get around CGNAT).
Thanks for the comment: that’s a really good point to raise.
Just to clarify: the statement “I use self-hosted services in the following categories as much as possible” is meant to reflect how fully you make use of self-hosted solutions in each area. A response like “Strongly agree” would indicate that you actively use and take full advantage of self-hosting in that category.
If you don’t use solutions in a particular category at all — whether that’s because you don’t need them, aren’t interested, or use only external services — then it’s completely appropriate to select a disagreeing option (e.g. “Disagree” or “Strongly disagree”). In this context, lower agreement simply indicates low or no use, regardless of the reason.
From a methodological standpoint, the data will be analyzed using structural equation modeling (SEM). This approach requires a complete set of responses across the measured constructs. If we included a “not applicable” option, it would create missing values in the dataset and potentially lead to excluding the entire response for that part of the analysis — which would significantly reduce the usable sample size.
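To make the sample-size concern concrete, here is a minimal sketch. It assumes incomplete responses are simply dropped (listwise deletion), which is one common way of handling missing values and not necessarily what this study would do. The category names and answers are made up for illustration and are not survey data.

```python
# Hypothetical 5-point Likert answers (1-5) for three made-up categories.
# None stands for a "not applicable" answer, i.e. a missing value.
responses = [
    {"media": 5, "home_automation": None, "bookmarks": 2},
    {"media": 4, "home_automation": 1, "bookmarks": 3},
    {"media": 3, "home_automation": None, "bookmarks": None},
    {"media": 5, "home_automation": 4, "bookmarks": 5},
]


def complete_cases(rows):
    """Keep only the rows with no missing answers (listwise deletion)."""
    return [row for row in rows if all(value is not None for value in row.values())]


usable = complete_cases(responses)
print(f"{len(usable)} of {len(responses)} responses remain usable")
```

In this toy data, a single "not applicable" answer is enough to drop a respondent from the complete-case set, which is the kind of shrinkage described above.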
That said, I really appreciate your feedback! :)
Be prepared for some respondents to choose the middle option as a proxy for “not applicable,” because that’s what I did.
I get why you’re taking that approach but you risk serious misclassification bias. The replies have stated people are using both “disagree” and “neither agree nor disagree” to indicate they are not hosting a particular kind of service. From your description of your research it sounds like disagree and strongly disagree should indicate that the individual uses company hosted services instead of self hosted services for those domains. The relationship between views on privacy and types of services self hosted is going to be confounded by that.
Yeh, I took “don’t agree or disagree” to be the N/A.
It seemed the most neutral.
I don’t really use anything for bookmark sharing/management. So I don’t strongly disagree or strongly agree with self hosting it.
If this was the expectation, then it should have been a checklist and/or had N/A available. I don’t think your data for this section will be accurate since myself/others replying did not use it this way.
I used “disagree” as “I am using a non-self hosted service for this,” the middle as “N/A” and the agree as “I am hosting a service for this”…
i smoked some good weed like half an hour ago, do i need to wait
I have answered, and had to put “Other” in employment status because I am self employed. An option for self employment would have been useful in my opinion!
Thank you for your feedback! You’re right, self-employment could be listed more clearly, but choosing “Other” was absolutely fine and your response is fully valid. Thanks again!
Have you thought about contacting Louis Rossmann? He created an extensive video guide on how to self host using FOSS. Perhaps he’d be willing to highlight your survey to his over 2 million subscribers.
I self host for the same reason I’m not clicking some random link: distrust lol
um.si is for University of Maribor in Maribor, Slovenia. It looks legit.
As a fellow comp sci graduate (different uni, long time ago) that doesn’t fill me with a ton of confidence lol.
This could actually be a study on phishing lol
Or a less ethical study on virus propagation
Or maybe he has just gone rogue and his university hasn’t noticed because they probably don’t actually monitor what students are hosting very closely, as long as it’s not causing problematic network traffic.
I doubt it, but I’m still not gonna click a link that someone is asking me to click lol
If you’re that vulnerable to shady URLs, you may want to rework your blockers or even spin up a VM. If you’re that vulnerable to phishing, just don’t give them your numbers.
Why risk it?
Can’t limit yourself to twitter, facebook and google. Hell, I think those sites are the riskier ones.
Those are known evils I know how to deal with, that are popular enough that there are foss tools to handle them.
But I’m being a bit hyperbolic here. I’m not as paranoid as I’m portraying myself… I just don’t have any motivation to click this particular risky link.
If I have a file, I have it.
If google has my file, they say they have it. I’m told it’s there. For how long? I dunno. Private? Hell no. Forever? Likely not.
This small discrepancy is the entire drive behind me selfhosting.
I’m a minimalist with selfhosting, a raspberrypi with a vpn connection, syncthing and a samba share is all most anyone really need-needs.
Also there’s that a file on a cloud service might change. E.g. Amazon sometimes updates ebook covers to advertise that there’s a show - even for those who have paid extra to have the ad-free option.
E.g. the sticker-type graphic on this and that the title is updated to “The Fires Of Heaven: Book 5 of the Wheel of Time (Now a major TV series)”:
I downloaded music I bought online and copied it to my Google drive once. This was years back, mid 2010s, this album just came out for my favorite artist back then. I’d downloaded it back to another pc and a week later - poof. No more mp3s. @.@
Edit: just that folder of that album’s mp3s, not my whole music library back then, just to be clear. Still, that was my first big burn from cloud services.
Can you elaborate? I keep a copy of my favorite music in Google drive just to download in different PCs or to keep a backup.
It was the Muse album Drones that I bought from the band’s website. I thiiiink I shared a link to the folder to my netbook with a different account to download to, and then I didn’t notice til later (maybe it was a week? A month?) the folder of just this muse album was empty.
Idr checking the trash can for my Google drive or anything. I don’t think I got a notice because I searched “copyright” in my emails circa 2015, and nothing related to removing my files popped up.
Hmm. The first section about cloud service providers is a bit weird to me. There are providers which “keep my best interests in mind” as part of their business model, backblaze would be one. Their whole idea is to provide a good backup service. Encrypting my data before transit also doesn’t make me worried that it will be accessed by them or any of their employees because they will only get some garbled mess.
Compare that to google, another cloud service provider. Their business model is to make money by selling me ads (foremost), they do that by gathering as much data as possible. Here all my answers would be negative.
This puts me in an awkward spot where I nearly every time answer with “Neither agree nor disagree”, because there is more to it and not because I don’t have an opinion.
Thank you very much for your thoughtful feedback!
You’ve raised an important point: cloud service providers are not all the same, and their business models can significantly influence how much trust users place in them. We fully agree that there’s a big difference between providers like Backblaze, whose value proposition is built on privacy and reliability, and companies like Google, where monetization often relies on extensive data collection.
The purpose of this section in the survey is to explore general perceptions and the motivations behind them, not to evaluate individual providers. However, we understand that this generalization can be limiting, especially for respondents who distinguish clearly between different types of services and trust models. Your situation, where you answered “Neither agree nor disagree” not out of indecision but due to the complexity of the issue, is very insightful.
Thanks again for taking the time to share this, it’s greatly appreciated!
Felt that too. It’s throwing all providers in one bucket, which makes it very hard to judge.
Did my part! Good luck!
Thanks a lot, really appreciate it! :)
Because I don’t need to pay rent for my files and I don’t have to worry about AI and VCs trying to invade my privacy.
I hope you share the results when your thesis is done :)
Thanks so much – I definitely will! The results will be published in my PhD dissertation, and since publication in a scientific journal is a requirement for completing the degree, they’ll be shared there as well. I’ll make sure to post a link here once everything is available! :)
Hopefully you can publish in an open-access journal — if not it would be great if you could share an arXiv preprint :)
Absolutely, that’s our intention as well! Our university actively encourages publishing in open-access journals whenever possible, and I fully support that approach. So yes, if all goes well, the results will definitely be published open access. Thanks for the encouragement! :)
Best of luck!
Filled in the survey. A few notes:
- Some of my answers make no sense on the surface - like the “experiment with new technology” block (4 questions). I’ve answered “Agree” to all of them, because I have taken time into account, which is not represented in the questions. Long story short - I do love experimenting with new tech, I’m almost always the first one to try something among my peers, but at the same time I never blindly jump in (I’m hesitant) as most of the “new technology” is just
- Someone repackaging foss and relabeling it
- Some LLM bullshit
- An inferior product to what already exists
There are also scenarios where I have already found something that’s the best solution for my case, so I won’t even bother looking at something new, even if it might be the best thing since sliced bread for someone else.
- Time and effort setting up/maintaining (4 questions). It doesn’t take much time nor effort to set anything up now, but it did when I was starting out initially. I knew very little and a bunch of concepts hadn’t clicked yet, so it took me days to set up Nextcloud and about half a year (on and off. Probably a week or so if it were all squeezed together) for email.
- The performance and intent to use in the future questions are weird - they feel like the same question, just leveling off in intensity. I’ve selected the same answer for all of them. They probably should’ve been a single question with agree/disagree options swapped for intensity levels.
Good luck with your PhD!
I did it for ya…good luck with your phd
Thank you so much, really appreciate it! :)