ownCloud newbie desperately needs help

help

#1

Hi everyone. I'm new to ownCloud and facing some mind-boggling issues with my setup. I'm not an IT person or a sysadmin, but I do know what a database, an Apache server, network encryption, etc. are, and I'm very comfortable with the theory of an ownCloud system.

Setup: 3 x Win7 clients + 1 x Mac client, syncing data with a hosted server at TMDhost.com. Server side installed via Softaculous with automated scripts. This part works fine: files upload, download and sync between clients and server quickly.

1) After the predictable mess of users changing their minds about root directory names (we have 5 root folders and 5 distinct sync connections), I find that some folders on the server, and some files underneath them, cannot be deleted by anyone, not even the admin account from the web interface on the server. There is no useful error message, just "error deleting XYZ".
I understand this must be a DB file lock issue discussed elsewhere. My problem is that it affects over 1 GB of files across several hundred sub-folders, so I'm quite worried about applying a brute-force SQL script. I don't have the SQL skills to back up everything, test the script, check the logs, roll back in case of problems, etc. Especially if this is going to trigger mass syncing to 4 clients immediately after.
The ownCloud log file is 1 MB long and filled with useless status messages such as:

=#=#=#=# Propagation starts 2017-03-11T11:40:56 (last step: 1845 msec, total: 1845 msec)

||01 - MM OSB/00-The Story/ECIP -> 01 - ECIP OSB/00-The Story/ECIP|INST_RENAME|Up|1488076049|invalid|0|00001576ocmrmrephzwi|3|Error transferring https://cerulean.asia/owncloud/remote.php/webdav/CCCloud/01 - MM OSB/00-The Story/ECIP - server replied: Locked|423|0|0|||INST_NONE|

=#=#=# Syncrun finished 2017-03-11T11:40:56 (last step: 629 msec, total: 2474 msec)

I say useless because there are so many different types of errors that I can't see any pattern that would direct me to what is wrong.
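One way to look for a pattern is to count the failed items by HTTP status and server reply. The following is only a rough Python sketch: the field positions are guessed from the single sample line quoted above (path in the 3rd pipe-separated field, error text in the 11th, HTTP status in the 12th), and the function names are made up for illustration. It would need adjusting if other lines in the log are laid out differently or if a path itself contains a `|`.

```python
from collections import Counter, defaultdict

# ASSUMPTION: positions inferred from one sample line of the client sync log.
PATH_FIELD = 2
INSTRUCTION_FIELD = 3
ERROR_FIELD = 10
STATUS_FIELD = 11


def parse_sync_line(line):
    """Return (path, instruction, status, error) for a failed item, else None."""
    if not line.startswith("||"):
        return None  # banner lines such as "=#=#=#=# Propagation starts ..."
    fields = line.rstrip("\n").split("|")
    if len(fields) <= STATUS_FIELD:
        return None
    error = fields[ERROR_FIELD].strip()
    if not error:
        return None  # item carried no error text
    return (fields[PATH_FIELD], fields[INSTRUCTION_FIELD],
            fields[STATUS_FIELD], error)


def summarize(lines):
    """Group failed items by (HTTP status, short reason)."""
    counts = Counter()
    paths = defaultdict(list)
    for line in lines:
        parsed = parse_sync_line(line)
        if parsed is None:
            continue
        path, _instruction, status, error = parsed
        # Keep only the text after the last " - ", e.g. "server replied: Locked".
        reason = error.rsplit(" - ", 1)[-1]
        counts[(status, reason)] += 1
        paths[(status, reason)].append(path)
    return counts, paths


def print_summary(log_path):
    """Print one line per error type, most frequent first."""
    with open(log_path, encoding="utf-8", errors="replace") as handle:
        counts, _paths = summarize(handle)
    for (status, reason), n in counts.most_common():
        print(f"{n:6d}  HTTP {status or '-'}  {reason}")
```

Calling `print_summary` with the path of a saved client log prints each error type with its count. The `paths` dictionary returned by `summarize` holds the affected paths per error type, so the 423 "Locked" items can be listed separately from everything else.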

2) Compounding this problem is the other predictable mess of Mac filenames that are incompatible with the Win7 file system. Some of the Mac filenames end with spaces or contain forbidden characters. This causes (expected) sync failures with Win clients. The difficulty is that the ownCloud "activity" tab briefly shows the errors... and then clears! By the end of a sync cycle, I see only 3-4 errors in the display window, not even related to file naming, when I know there are hundreds of file naming problems.
Trying to extract from the log the list of faulty filenames, hidden among the error lines above, is above my IT skills. I understand something like GREP might help, but I'm not a sysadmin and can't write the right commands. I'm also working remotely on a hosted basic box, so I have no idea where to start to get a command line interface...
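An alternative to digging through the log is to scan the Mac user's folder directly for names that Windows cannot store. A minimal Python sketch follows, assuming the usual Windows restrictions (the characters `< > : " / \ | ? *` and control characters are forbidden, names may not end in a space or a dot, and device names such as CON or NUL are reserved). The function names are illustrative only, and the script only reports; it renames nothing.

```python
import os
import re

# Characters Windows does not allow in a file or folder name.
FORBIDDEN_CHARS = re.compile(r'[<>:"/\\|?*\x00-\x1f]')
# Device names Windows reserves, with or without an extension.
RESERVED_NAMES = (
    {"CON", "PRN", "AUX", "NUL"}
    | {f"COM{i}" for i in range(1, 10)}
    | {f"LPT{i}" for i in range(1, 10)}
)


def windows_problems(name):
    """Return a list of reasons why `name` would be rejected on Windows."""
    problems = []
    if FORBIDDEN_CHARS.search(name):
        problems.append("forbidden character")
    if name.endswith((" ", ".")):
        problems.append("trailing space or dot")
    if name.split(".")[0].upper() in RESERVED_NAMES:
        problems.append("reserved device name")
    return problems


def scan_tree(root):
    """Yield (path, problems) for every file or folder with a bad name."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            problems = windows_problems(name)
            if problems:
                yield os.path.join(dirpath, name), problems
```

Run on the Mac against the sync folder, `for path, problems in scan_tree("/path/to/sync/folder"): print(path, problems)` would give a complete list to fix by hand, independent of what the activity tab happens to show.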

3) Now is the interesting part. Until now I'd advise RTFM...
But here is a 3rd problem, really unexpected: one user had the great idea to rename a directory at level 2, just below the sync root directory and above a 1.2 GB large tree. The directory rename did NOT sync to any client, but it did cause a duplicate tree on the server side. This duplicate tree (old tree + new tree) has partially synced to all clients..and stopped. I can see on my Win 7 system 2 x directories, one old, one new. I can see the same on the server. I can also see the same in the OC client window. It has somewhat started to sync to other clients as well.
BUT, the directory sizes and contents are WRONG. On my client Win 7 machine:
- TreeSize shows Dir A = 1.2 GB and Dir B = 667 MB
- OC on the same machine shows Dir A = 863 MB and Dir B = 1 GB
- OC server (web interface) shows the same as OC client (Dir A = 863 MB / Dir B = 1 GB)
- Win Explorer on the client matches TreeSize
=> This is not a problem of decimal points, or of counting KB versus MB. The counts are really far off
=> Directory trees which have synced correctly are all good and coherent everywhere. Only this huge tree is completely inconsistent.
=> checking things manually, OC is failing to sync 337M from Client Dir A up from client PC to the server, and failing to sync 333M from Server Dir B down to client Dir A.
Looking only at directory sizes, it seems the pointers between Dir A and Dir B are crossed somewhere, which would explain the odd symmetry and failure to sync properly (337 = almost 333, difference could come from Linux vs Win and KB/MB differences)
Looking at directory contents gives no clue. The contents are in sync from a filesystem point of view, but the OC client and server counts are completely off (the OC client is in fact reporting what is on the server, not what is on the client filesystem. I've tried quitting OC and restarting it to make sure it reads a fresh copy from the disk, but no change)
=> The only explanation I can see is that 1) windows/treesize are definitely correct and 2) OC client and server are completely wrong about the size of the folders that are being synced. They must be counting from an internal file list (in OC db?), which is wrong, both in file count and in file size. I'm in a real mess!
The good news is that if OC doesn't even have a clear picture of what it is trying to sync, then I don't need to worry about my first two problems!
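To put numbers on the comparison above without relying on any one tool, the file count and byte total of each local tree can be computed directly. This is a small Python sketch with made-up function names. It prints both decimal (MB) and binary (MiB) figures, since the two differ by only about 5% at this scale and so cannot account for a gap like 1.2 GB versus 863 MB.

```python
import os


def tree_totals(root):
    """Return (file_count, total_bytes) for all files under `root`."""
    count = 0
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                total += os.path.getsize(path)
            except OSError:
                continue  # unreadable file or one that vanished mid-scan
            count += 1
    return count, total


def describe(total_bytes):
    """Format a byte total in decimal and binary megabytes."""
    decimal_mb = total_bytes / 1000**2
    binary_mib = total_bytes / 1024**2
    return f"{decimal_mb:.0f} MB (decimal) / {binary_mib:.0f} MiB (binary)"
```

Running `tree_totals` on Dir A and Dir B on each client gives figures that can be set side by side with what the OC client and the web interface report. The file count is often more telling than the size.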

Conclusion:
1) How is it possible that OC is such an unstable app? We have not played with any config parameters or tweaked anything. It's just out of the box from Softaculous, and up to date everywhere. I was expecting to need a long time to stabilize a system-wide sync, with possible duplicates and a clean-up period afterwards. Instead it feels like I'm alpha-testing...
2) I'm not sure where to go from here. I can't tell if any user has a full copy of everything (approx. 8 GB, 55k files, cannot check manually) that we could use as a master copy, wiping everything and starting from scratch with that copy.
I'm also worried that trying to fix in a sequential approach will trigger further sync and complicate the mess beyond repair.
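Checking 55k files by hand is not realistic, but a checksum manifest per client would show which machine, if any, holds a complete copy, and it only reads files, so it should not trigger any sync. A hedged Python sketch follows. The function names are invented for illustration, and it assumes each client can run the script against its own sync folder with the OC client paused.

```python
import hashlib
import json
import os


def build_manifest(root):
    """Map each file's path (relative to `root`) to its SHA-256 digest."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            digest = hashlib.sha256()
            with open(path, "rb") as handle:
                # Read in 1 MiB chunks so large files do not fill memory.
                for chunk in iter(lambda: handle.read(1 << 20), b""):
                    digest.update(chunk)
            rel = os.path.relpath(path, root).replace(os.sep, "/")
            manifest[rel] = digest.hexdigest()
    return manifest


def save_manifest(manifest, out_path):
    """Write the manifest as JSON so it can be copied between machines."""
    with open(out_path, "w", encoding="utf-8") as handle:
        json.dump(manifest, handle, indent=1, sort_keys=True)


def compare(a, b):
    """Return (only_in_a, only_in_b, different_content) as sorted lists."""
    only_a = sorted(set(a) - set(b))
    only_b = sorted(set(b) - set(a))
    different = sorted(p for p in set(a) & set(b) if a[p] != b[p])
    return only_a, only_b, different
```

Each user would save a manifest, the JSON files would be collected on one machine, loaded with `json.load`, and compared pairwise. A client whose manifest is a superset of all the others is a candidate master copy.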

If anyone has any great ideas on where to start, I would be really grateful, otherwise I'm going to have to consider moving to Google Drive or similar, which is NOT what I wish to do.

Thank you

PS: The server side admin account shows 2 setup errors which I thought were trivial, but in case it makes a difference:
Security & setup warnings:
- The "Strict-Transport-Security" HTTP header is not configured to at least "15552000" seconds. For enhanced security we recommend enabling HSTS as described in our security tips.
- No memory cache has been configured. To enhance your performance please configure a memcache if available. Further information can be found in our documentation.


#2

Small typo above:
=> checking things manually, OC is failing to sync 337M from Client Dir A up to server Client Dir A, and failing to sync 333M from Server Dir B down to client Dir B.


#3

Forgot something important. I realize the alpha-testing comment may offend the hard working volunteers behind OC.

My frustration comes from the fact that the OC client reports NO ERROR. All the syncs on the OC client appear with a green tick mark ("all good"), and sync activity keeps cycling quietly without changes.

The server gives no clue either that it has a major sync problem it can't resolve.

If I was not a stubborn, nit-picking guy checking carefully the details, we would be in the process of destroying 8GB of original master data.

I'm happy to cooperate with any developer who wants to spend time with the logs, or do some tests on why OC does not see that it has a problem.


#4

Hi,

it would really help if you filled out the issue template shown when creating a new thread. It asks for various information, such as the ownCloud version in use, which is currently completely missing here.

There is a FAQ about that available at [1]. Configuring redis for file locking is especially important if you have a lot of users.

[1]


#5

Thanks. Not sure how I missed the issue template - my bad, apologies.

Steps to reproduce
Very difficult to reproduce. Put together :
3 x Win7 users
1 x Mac user,
4 x big data archive with incompatible filenames (Win/Mac),
1 x power user with the great idea to use Owncloud
2 x vaguely computer-literate end-users
Shake and stir :slight_smile:
I can fix most data problems with slow manual checking, but I can't 1) delete server files that are stuck, 2) get the clients to display the same directory sizes as the OS reports, 3) get level 2 directory name changes to replicate without creating duplicate trees (and I don't dare try that again)

Expected behaviour
- Everyone uploads their archive
- Everyone syncs back the other uploaded archives
- Admin sees a log of clashing directories and filenames which are either not synced, or renamed with (1), (2), etc
- Users see a log of clashing directories and filenames which are NOT synced (because OC doesn't have an option to decide if server or client should have priority in case of conflict => this would be a nice feature)
- Each user can save the log of errors, and (nice to have), email a copy to OC admin

Actual behaviour
Most data is uploaded, but not all
Most data is synced down to other users, but not all
Mac odd filenames are uploaded to server, but never downloaded to Win clients. An error appears in the client log, briefly...and disappears by the time the power user remotes into end-user system to see what is happening
Some serious conflicts are NOT detected and do not produce a log of "data errors only" at user level. The client reports green sync status when in fact it's glowing red.
Directory renaming during a big sync causes a mess for everyone, including duplicate directory trees
Some server files cannot be deleted, without any explanation why, and no option to reset a db lock with a human readable interface
In case of sync corruption, client and server do not report correct directory sizes as seen by Operating System.
There is a 3 line log of errors visible on OC web admin page, and the ability to download a web admin 71 MB (!) log file

Client configuration
Client version: Version 2.3.0 (build 6780)
Client operating system: Win 7 / 64 Ultimate SP1

Server configuration
Operating system: hosted Linux box from TMDhost.com
Web server: Apache
Database: MySQL. Neither OC config report nor Cpanel can tell me which version
PHP version: 5.6
ownCloud version : 9.1.4.2
Updated from an older ownCloud or fresh install: fresh install, but the hosting admins did something to resolve a server resource issue when everything was going wrong at one point. See below for details.
Special configurations (external storage, external authentication, reverse proxy, server-side-encryption): not that I'm aware of. If there is any, it is transparent to me.

ownCloud log (data/owncloud.log)
The log is 71 MB, with logging set to "warnings, errors and fatal issues" only. I can upload it to a file sharing system if needed, or would it be better to reset the log at this stage and let it fill up with a small number of errors?

additional info
While searching through FAQs about the stuck db file locks, I noted some discussions about the CRON job type.
My (default) setting was AJAX, and it didn't seem to work well since I had many stuck files. Server was running out of resources the first time we started uploading data and making changes to root directories. Hosting admins made some changes in the back-end, and server is running normally now.
I changed the OC cron setting in the web admin to WEBCRON (no one is using the web interface except the admin, me). Looking at the setting again right now, it says "Last cron job execution: 12 days ago. Something seems wrong" even though the small print states it should trigger every 15 minutes.
I'm going to change the setting to straight "CRON" to see if it helps.

Thanks to anyone who can provide suggestions. Will check now the last post.


#6

Good news, the FAQ "how to remove file locks" has solved 90% of my issues.

I had seen this but not tried it, since I am on version 9.1.4 and the problem was supposed to have been fixed.
It's a bit scary to purge entire directory trees, but it's working.

I've had to delete locks twice in 10 minutes, so something is still wrong. I'll let the system stabilize with clean data on all clients and server before I report further progress.

Thank you


#7

Hi,

just changing the setting to "Cron" or "Webcron" is not sufficient. You need to take additional steps, such as calling cron.php from an external tool for Webcron, or configuring a system cron as explained in the documentation [1]. The redis-based file locking configuration explained in [2] is also something you really need to tackle / configure.

[1] https://doc.owncloud.org/server/latest/admin_manual/configuration_server/background_jobs_configuration.html

[2] https://doc.owncloud.org/server/latest/admin_manual/configuration_files/files_locking_transactional.html
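For the Webcron case, "calling cron.php from an external tool" simply means that something outside ownCloud requests that URL over HTTP at regular intervals. A minimal Python sketch of such a call is shown here. The URL `https://example.org/owncloud/cron.php` is a placeholder, the function name is made up, and the scheduling itself (every 15 minutes, via whatever scheduler is available on some always-on machine) is left out. On a host with shell access, the system cron route described in [1] is the better choice.

```python
import urllib.request


def trigger_cron(cron_url, opener=urllib.request.urlopen, timeout=60):
    """Request cron.php once and return (HTTP status, response body)."""
    # `opener` is injectable only so the function can be tried without a network.
    with opener(cron_url, timeout=timeout) as response:
        body = response.read().decode("utf-8", errors="replace")
        return response.status, body
```

Called as `trigger_cron("https://example.org/owncloud/cron.php")` from a scheduled task, it should make the "Last cron job execution" time in the admin page move forward. If it does not, the returned status and body are the first things to look at.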


#8

Locked files can happen at any time, and they are cleaned up automatically after some time. But if you have AJAX configured, this clean-up happens only rarely, and only if a user with locked files is browsing the web GUI of ownCloud.

So for a bigger setup like this, Cron background jobs and redis file locking, as explained in my previous post, are an absolute must.


#9

Ouch, thanks for the help and patience.
Playing with cron.php or redis is above my level of expertise. I'll ask the hosting admins for help.


#10

Ok. Just some additional notes:

The sync client developers have decided against this and the server will "always win". However, there seems to be some discussion ongoing at [1] about also syncing conflict files to the server.

This is already possible, as explained in the documentation at [2].

[1]

[2] https://doc.owncloud.org/desktop/2.3/troubleshooting.html#log-files


#12

Thank you for the advice.

In case this happens to someone else, fixing the bad file locks really helped to unravel the mess. Start with that before anything else.

Is there a good reason why the DB file locking/clearing cannot be made more robust during the initial install? I imagine there are many amateurs like me, who can follow simple instructions but don't know how to dig deep under the hood of a Linux server without breaking it :slight_smile:

PS: Thread can be closed. We're 95% done fixing the data archive, the remaining 5% being incompatible filenames, duplicate files/directories, and a few really odd ones. For example, all the clients report that "Cerulean/~$1-stop Shop in Myanmar.pptx" is not synced because "File is listed on the ignore list" (the ~ prefix marks a temporary Windows MS Office file), but there is no such file in any folder or on the server. I have "view hidden files" switched on both in Win Explorer and on OC Web, and the file doesn't exist. This is very minor, don't bother finding a fix.


#13

ownCloud is designed to be as simple as possible. Redis file locking, as well as the system cron background job, needs either:

  • access to the base system which people don't have (e.g. shared hoster)
  • knowledge about the operating system (which people often also don't have)

For this reason you can't enable these by default as they just won't work "out of the box".

In the past there were also some notices about both items in the admin backend to make people aware of the additional configuration possibilities. But people were upset about that (as is quite often the case) and were complaining about messages they couldn't fix on their own.

Normally these two things are also not an absolute must for most setups out there. Only setups like yours with more people need that. And for such setups a tuning documentation is available at [1].

[1] https://doc.owncloud.org/server/latest/admin_manual/configuration_server/oc_server_tuning.html